[02:07:09] <Mastema> anyone here have experience using mongo-c-driver in c++?
[02:16:16] <Boomtime> @Mastema: if you have a question about it, just ask
[02:16:40] <Boomtime> also, i think the latest C++ driver (non-legacy, non-2.6-compat) is just a wrapper on the C driver
[03:29:34] <Mastema> i'm getting an incomplete type "mongoc_client_t" error from a forward declaration. I'm starting to think it's a linking issue or related to the compiler itself. The code is fine, the library is installed fine, but it can't find the definition for the struct "mongoc_client_t", it seems....
[06:49:04] <kurushiyama> jokke: Walked an extra mile http://hastebin.com/fomidehizi.coffee
[06:51:07] <kurushiyama> jokke: roughly 6420 docs/sec on a sharded cluster running on my 4GB MBA, with another mongod instance running plus https://www.dropbox.com/s/aghq1ddv4lkkjmi/Screenshot%202016-05-04%2008.43.14.png?dl=0
[06:51:53] <kurushiyama> jokke: Not optimized for load distribution, single threaded insert. Your turn ;)
[06:54:07] <kurushiyama> jokke: 45MB index size with prefix compression, btw.
[07:13:57] <kurushiyama> jokke: Oh, and restoring a panel via aggregation takes between 5 and 7 msecs ;)
[07:15:00] <kaushal> I have a question regarding monitoring disk I/O activity using iostat and iotop. When I run those tools I see statistics like disk read and write, for example 5.65 M/s and 5.29 M/s. My question is: how do I figure out what the best, optimized number is? Should it be 10.3 M/s, or 2 M/s, or whatever number... in the case of a MongoDB NoSQL DB
[07:17:57] <kurushiyama> kaushal: That is an _entirely_ different question.
[07:19:35] <kurushiyama> kaushal: a) You should know your hardware, or b) be able to trust your hardware vendor. If the latter does not apply, I'd change vendors right away, without any further ado.
[07:22:15] <kurushiyama> kaushal: Can I take it for granted that you are running on some sort of cloud environment?
[07:22:44] <kurushiyama> kaushal: And you ordered SSDs?
[07:23:20] <last_staff> given http://sqlfiddle.com/#!9/14f0f
[07:23:52] <last_staff> which table would be a better one to keep as a top document in mongo?
[07:24:16] <last_staff> or rather, is mongifying a valid choice to begin with, given that I want to add categories, products and stores separate from the shopping list?
[07:24:26] <kurushiyama> last_staff: That _heavily_ depends on your use cases.
[07:24:32] <last_staff> i have to admit nosql is pretty new to me, so I'm not necessarily sure what goes and what doesn't
[07:24:34] <kurushiyama> last_staff: ALL of your questions do.
[07:24:44] <last_staff> like, could, e.g., a categories document be both its own document while at the same time being referenced in, say, a products document?
[07:25:23] <kurushiyama> last_staff: Wait a sec, please.
[07:27:04] <last_staff> "As a user, I want to add an entry in shoppinglist, which will reference categories, stores, or both"
[07:28:31] <last_staff> and a bit futuristic "As a user, I want to add a new entry in products, while adding an entry in shoppinglist"
[07:32:13] <last_staff> for the latter, the scenario would go somewhat like so:
[07:32:13] <last_staff> add shoppinglist entry -> new shoppinglist row appears -> products field combo box choice 'add new product' chosen -> products entry 'dialog' appears -> when valid info has been completed, product name appears in the field -> when clicking 'add', both product and shoppinglist entry are added to db
[07:32:56] <kurushiyama> Ah, user stories! Great point to start from, actually
[07:33:29] <kurushiyama> First of all: do you have a fixed set of categories?
[07:33:42] <last_staff> yeah, I'm a hypocrite when it comes to that. I keep saying '
[07:33:59] <last_staff> do user stories' - and never do the same myself
[07:34:17] <last_staff> no, it's the same as products, actually
[07:34:31] <kurushiyama> Ok, good news: no collection for them, then.
[07:35:36] <last_staff> so I'm immediately seeing the issue of potentially multiple dialogs when adding new stuff to the list, which I'm thinking will be a very common scenario in the beginning
[07:35:55] <kurushiyama> last_staff: Let's talk in general terms first.
[07:37:14] <kurushiyama> last_staff: I found that the best approach to modelling data for MongoDB is to take the user stories, derive the views you want to build for them, identify the data you need for those views, and then model your data accordingly, optimized for the most common user stories.
[07:37:29] <kurushiyama> last_staff: Let's do that later.
[07:38:36] <last_staff> it's actually a lot like when I work with test cycle documentation
[07:39:08] <kurushiyama> So, I guess you are doing a shopping list, and the items should be listed together with the place where you would buy each item, grouped by place, I presume?
[07:39:29] <last_staff> where the test cycle prerequisites map directly to 'data'
[07:39:58] <last_staff> yeah, that's where that table I forgot comes in
[07:40:58] <last_staff> "As a user, I want to order a shopping list according to a store's predefined chronological layout"
[07:41:30] <Mastema> I basically compiled a mongoc example inside of an fcgi program. When I try to run it, I get an assertion failure and it aborts on the bcon_new() call... "Assertion `type == BCON_TYPE_UTF8' failed."
[07:42:01] <kurushiyama> last_staff: Ok. Well, that makes it a bit more complicated than I thought. Agree to keep it easy for starters?
[07:42:17] <last_staff> basically, if store A has fruits in the section where you first enter the store, those are the ones that go at the top of the list
[07:42:55] <kurushiyama> last_staff: I get the picture.
[07:43:23] <last_staff> kurushiyama: but sure, I'm after the nosql mindset to begin with, after all
[07:44:13] <kurushiyama> last_staff: Ok, let's define "locations" as stores for now, and say you want to go to the butcher and a grocery store, for example.
[07:46:10] <kurushiyama> last_staff: You are planning to have the tags in the items and the stores to associate items to stores?
[07:47:24] <kurushiyama> Mastema: Never worked with the c driver, sorry.
[07:48:31] <kurushiyama> last_staff: for example {item:"lamb chops",tags:["meat","lamb"]} and {store:"Superbutch",tags:["meat","lamb"]}?
[07:48:37] <last_staff> kurushiyama: it should be possible, yes. At the same time, it should also work as a regular old-fashioned unordered shopping list
[07:49:17] <last_staff> wonderful, now I'm hungry for lamb...
[07:50:16] <last_staff> am I seeing two separate documents?
[07:51:21] <kurushiyama> last_staff: So when entering "lamb chops" to the list, you should get a selection of shops which have one or more of the lamb chops' tags, so that the shopper can choose where to buy the stuff (Superbutch might be better than Porkmaster for lamb chops)
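A minimal shell sketch of that tag-match lookup, assuming hypothetical items and stores collections shaped like kurushiyama's example documents:

    // look up the item, then find stores sharing at least one of its tags
    var item = db.items.findOne({item: "lamb chops"});
    db.stores.find({tags: {$in: item.tags}});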
[07:52:44] <last_staff> it's actually tag-centered, then
[07:53:11] <kurushiyama> last_staff: That was _my_ idea. You can make it shop-centered, too. Might be easier in the beginning.
[07:54:10] <last_staff> and then derive my own setup from there, gotcha
[07:54:16] <kurushiyama> last_staff: What we do know by now, however, is that we want the following things to be displayed: item name, qty, and where to buy it.
[07:55:22] <kurushiyama> plus, an item belongs to a distinct shopping list, which is the known factor.
[07:55:24] <last_staff> right, which in my case is the section of a store
[07:57:13] <kurushiyama> so our first approach at a data model for list items would be {_id: new ObjectId(), item:"lamb chops", qty: "enough for last_staff and kurushiyama", store:"Superbutch"}
[07:58:45] <kurushiyama> However, the shopping list reference would be missing, so we expand to {_id: new ObjectId(), list: someIdReference, item:"lamb chops", qty: "enough for last_staff and kurushiyama", store:"Superbutch"}
[07:59:37] <last_staff> go away, stupid smileyface
[08:00:18] <kurushiyama> ;) But it is clear so far? Question: why did we reference the list, but not the store?
[08:03:56] <last_staff> ...to be able to have that same item in multiple lists?
[08:06:48] <kurushiyama> last_staff: No. That item is tied to the list by id reference. It is simpler: the question you basically have is "For a _given_ list, what are the items on it?" So the list is given. If we referenced the store, however, we would have to look it up when displaying the shopping list – since there are no JOINs in MongoDB, this is rather awkward and you want to avoid it. Data redundancy helps us to reduce the number
[08:06:49] <kurushiyama> of queries necessary to display the full list.
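A sketch of that single-query read in the shell, assuming an items collection shaped like the model above:

    // one query fetches everything needed to display the list; no store lookup
    // is required, because the store name is embedded redundantly in each item
    db.items.find({list: someListId});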
[08:07:52] <kurushiyama> last_staff: And, let us be honest: Nobody cares how the shop was called last year, even in case the name changes.
[08:08:56] <kurushiyama> last_staff: Clear so far?
[08:11:20] <last_staff> I'm still kinda processing that redundancy part
[08:11:54] <last_staff> it's sort of....a positive 'more is less'
[08:12:48] <kurushiyama> last_staff: Redundancy is not a bad thing nowadays. Disk space is cheap, with wiredTiger's compression even cheaper/doc.
[08:13:08] <kurushiyama> last_staff: Performance, however, is hard to achieve.
[08:13:32] <kurushiyama> last_staff: Especially consistently in a large scale.
[08:13:39] <last_staff> well it's not the data usage I'm concerned about; I'm thinking about what happens if you want to remove an item completely
[08:13:53] <kurushiyama> last_staff: We come to that later.
[08:15:14] <kurushiyama> last_staff: The thing is that this collection of items is not ideal. For example, how do we deal with outdated items?
[08:16:42] <kurushiyama> last_staff: So let us think of actually modelling a shopping list: {_id:new ObjectId(), date: new ISODate(), owner:userRef, items:[{item:"lamb chops", qty: "enough for last_staff and kurushiyama", store:"Superbutch"},...]}
[08:17:22] <kurushiyama> last_staff: Forgot a name, but you get the picture, I suppose.
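A sketch of creating such a list in the shell; the "name" field and its value are hypothetical stand-ins for the field kurushiyama forgot:

    db.lists.insert({
      _id: new ObjectId(),
      name: "Saturday shopping",   // hypothetical stand-in for the forgotten field
      date: new ISODate(),
      owner: userRef,              // reference to the owning user
      items: [
        {item: "lamb chops", qty: "enough for last_staff and kurushiyama", store: "Superbutch"}
      ]
    });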
[08:17:59] <last_staff> a bit rough on the edges, but mostly following, yes
[08:27:36] <kurushiyama> So, how would we get said list? A user would create one, select lamb chops, get the selection of Superbutch or Porkmaster, select Superbutch, and whoosh, we have our first item for the list. Rinse and repeat.
[08:29:44] <kurushiyama> Now, the user would want to see his shopping list. All you have to do now is to load the single list. _One.single.doc_
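In shell terms, displaying the list is a single read against the hypothetical lists collection from the sketch above:

    // the entire shopping list, items included, in one document
    db.lists.findOne({_id: listId});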
[08:31:52] <kurushiyama> last_staff: This is called embedding. There are some drawbacks to this (http://blog.mahlberg.io/blog/2015/11/05/data-modelling-for-mongodb/), but none of them apply here (unless we are talking of shopping lists containing hundreds of thousands of items).
[08:33:01] <kurushiyama> last_staff: Clear so far?
[08:34:15] <last_staff> hold on, skimming through the link
[08:35:06] <kurushiyama> last_staff: "Conclusion" is enough for now.
[08:37:29] <last_staff> yeah ok, I don't foresee huge trees of subdocuments either
[08:38:24] <kurushiyama> So, basically what you do is to construct a shopping list containing everything you need to display it properly from (predefined) items and the places you get them.
[08:39:21] <kurushiyama> last_staff: Need a break, haven't had breakfast, yet.
[08:39:48] <last_staff> I have to head out soon, but thanks for the help so far
[08:40:35] <kurushiyama> last_staff: as for array manipulation, have a look at $push, $pull and probably $elemMatch
[08:41:54] <last_staff> oh yeah, that db.document.find({$push {arraystuff}})
[08:42:19] <kurushiyama> last_staff: As for the ordering, I'd probably do something like adding an index field to the places, so you'll have {item:"Apples", store:"Superfood", place:1}
[08:43:05] <kurushiyama> last_staff: You can not push in a find. It'd be db.collection.update(query, {$push: {items: newItem}})
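Minimal sketches of those array operators against the embedded list model above; listId and the values are hypothetical:

    // $push: append a new item, including its place index for in-store ordering
    db.lists.update({_id: listId},
                    {$push: {items: {item: "Apples", store: "Superfood", place: 1}}});

    // $pull: remove an item from the list again
    db.lists.update({_id: listId}, {$pull: {items: {item: "Apples"}}});

    // $elemMatch plus the positional operator: modify one matching item in place
    db.lists.update({_id: listId, items: {$elemMatch: {item: "lamb chops"}}},
                    {$set: {"items.$.qty": "two racks"}});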
[09:37:59] <Derick> it's not a mgo thing, it's a protocol thing. Every driver should have that implemented
[09:38:18] <kurushiyama> Derick: Ah, did not know that. But then, I do not maintain a driver ;)
[09:38:23] <Derick> first batch is by default 101 documents, then in 4MB increments (if possible)
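For reference, the shell exposes a knob for the cursor's batch size; a quick sketch:

    // request up to 1000 documents per batch instead of the server defaults
    db.coll.find().batchSize(1000);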
[09:43:27] <kurushiyama> I came across a question these days whose answer I can not remember: if a chunk is being migrated, how are updates handled for a doc contained in that chunk?
[09:44:17] <Derick> kurushiyama: I'm going to have to decline to answer that as I don't know
[09:48:12] <kurushiyama> Derick: No problem. There is another interesting thing I can not remember: In a sharded collection, if you do db.coll.update({_id:someId},{$set:{designatedShardKey:"newval"}}), as far as I get it, the update would still have to be done on the old document. Now how is that document moved to the proper shard for "newval"? ;)
[09:50:49] <kurushiyama> Or would it result in a single document chunk with the according key range on the config?
[10:06:57] <Keksike> Hey, our Clojure program uses the mongo java-driver 2.x, and I'm updating it to use 3.x. Something breaks in the program but I'm not quite sure what. We use bulk search operations; could that be the cause?
[10:24:01] <mk1> hello there. I'd like to know if it's possible to update a field to the value of another field divided by 10
[10:24:59] <mk1> or 1000 for that matter. for performance reasons I need the time in milliseconds and also in seconds. now I update a document's time and I'd like to increment the milliseconds field and then update the seconds field based on the milliseconds field
[10:27:12] <BurtyB> mk1, iirc you'd need to do it in javascript
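In these versions there is no update operator that sets one field from the value of another, so a shell-side loop is the usual workaround; a sketch with hypothetical collection and field names:

    // recompute the seconds field from the milliseconds field, doc by doc
    db.events.find({}, {millis: 1}).forEach(function (doc) {
      db.events.update({_id: doc._id}, {$set: {seconds: doc.millis / 1000}});
    });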
[10:42:49] <jokke> kurushiyama: i added flat_bulk to my benchmarks but i must say i'm not liking what i'm seeing.. :/ It distributes fairly well with _id.panel_id as the shard key, but the write performance is very poor despite bulk inserts of 100 docs at a time (i can't go higher than that, since in the real world we would get the samples as bulks of 100 or less, as i said before)
[10:44:07] <jokke> right now i'm inserting 1 hour worth of data for 100 panels with 100 data sources each and it takes longer than 1 hour (seems to take more than two hours)
[10:45:29] <jokke> the total amount of docs for this time period is 18,000,000
[10:47:51] <jokke> the index is also frighteningly large despite prefix compression: the benchmark is 20% through, so the collection currently holds about 15 mins worth of data. The index is already close to 100M
[10:48:05] <pamp> My question is: how do I block index creation in the foreground?
[12:44:35] <Secret_> Can anyone help me real quick? http://stackoverflow.com/questions/37026833/mongodb-rearrange-items-in-nested-array?noredirect=1#comment61604234_37026833
[12:44:44] <Secret_> How do I rearrange my nested array?
[12:47:36] <StephenLynx> I would just get the current value, manipulate and use a $set to replace it
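A sketch of that read-modify-$set approach in the shell; the collection and field names are hypothetical:

    var doc = db.coll.findOne({_id: someId});
    // rearrange the array however needed, e.g. move the first element to the end
    doc.items.push(doc.items.shift());
    // write the reordered array back with a single $set
    db.coll.update({_id: doc._id}, {$set: {items: doc.items}});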
[14:14:29] <zylo4747> If anyone can help me troubleshoot this intermittent crashing issue I'm experiencing I'd appreciate it. Here's the post on the mongodb-user community forum https://groups.google.com/forum/#!topic/mongodb-user/9ady1k9HLpc
[14:17:06] <zylo4747> Also, can anyone tell me if the journal file is replicated between replica set members?
[14:17:23] <kurushiyama> zylo4747: No to the last question.
[14:17:53] <cheeser> wouldn't make sense to replicate the journal file
[14:18:25] <zylo4747> ok, yeah I didn't think it was, just making sure. thanks
[14:18:56] <zylo4747> i'm just trying to figure out how this crash is happening among different nodes in my replica set. the only thing i can think of is that the database is corrupt
[14:24:06] <zylo4747> If I wanted to reset the journal file on the replica set member, what's the best way to do this? Should I remove the node from the replica set, delete the data directory, then re-add it and let it sync?
[14:24:23] <Derick> why do you want to reset the journal file?
[14:25:14] <zylo4747> because i'm seeing that at some point, the data volume ran out of space and since then we've been seeing these intermittent crashes. I am not certain, but i have a hunch something in the journal file got corrupted.
[14:25:16] <kurushiyama> Mixing up journal and oplog?
[14:25:34] <zylo4747> i'm honestly not sure. i'm grasping at straws to find out what's going on with my replica set
[14:26:05] <zylo4747> the errors always reference [journal]
[14:48:56] <zylo4747> ok, apparently my whole problem is disk space. I did a full analysis of the logs and I'm seeing that just before each crash it's a disk issue. I will work with the storage team. thanks everyone
[15:14:49] <Lujeni> Hello - there is a special process to remove an old shard database? or the dropDatabase command is enought? Thanks
[15:15:28] <cheeser> dropDatabase() should be sufficient
[15:35:37] <crazyphil> I've got a connector reporting "last entry no longer in oplog, cannot recover", this only happens in one collection, and only on 1 of 3 shards, where do I go to figure out what's going on?
[15:37:07] <silviolucenajuni> Can someone recommend good material for certification study? I did M101P and read the Chodorow book, but I don't know if just that is all I need.
[15:41:30] <eMyller> Hi everyone. I'm trying to retrieve from a collection of albums only those having images associated. I'm trying to learn from the documentation but am also in a bit of a hurry... Could anyone point me to something, please?
[15:42:57] <eMyller> Something like `albums.find({ $having: count(album.images) > 0 })`
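There is no $having operator, but querying for a non-empty array does the job; a sketch assuming the albums hold an images array field:

    // albums whose images array exists and has at least one element
    db.albums.find({"images.0": {$exists: true}});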
[16:18:43] <kurushiyama> silviolucenajuni: If you pass that, everything else is pretty much a piece of cake with icecream and cherry
[16:20:03] <kurushiyama> crazyphil: Hm, is it possible that we are talking of the busiest shard?
[16:20:28] <kurushiyama> crazyphil: And what type of connector are we talking about?
[16:25:44] <crazyphil> kurushiyama: it's a python connector that pulls data from Mongo
[16:26:00] <crazyphil> 3 other collections have no problems, but they aren't anywhere near as large as this one
[16:26:14] <crazyphil> how do I tell which shard is busiest?
[16:27:10] <silviolucenajuni> thx kurushiyama. I'll do it
[16:27:22] <crazyphil> according to rs.status() I'm pretty evenly balanced, showing 1040, 1040 and 1044 across my shards
[16:27:30] <kurushiyama> crazyphil: I am away for a second. Please pastebin the output of sh.status() and https://docs.mongodb.org/manual/reference/method/db.collection.getShardDistribution/ for that collection
[16:27:51] <kurushiyama> crazyphil: meanwhile, be back in 20
[16:55:57] <kurushiyama> zsoc: Just ask ;) Personally, I might not be of much help, though
[16:56:44] <zsoc> So the mongo driver has update+upsert which lets you add new or update if exists without throwing an error
[16:57:17] <zsoc> But I would like to do that with a request formulated like an insert (or Mongoose's .Create), where I can just give a whole array of documents for the upserting
[16:57:46] <zsoc> Do I need to iterate them in the app and hand them to mongo one at a time?
[16:59:09] <kurushiyama> zsoc: Look out for bulk operations. Not sure whether Mongoose supports that.
[16:59:41] <zsoc> Well, it sort of does. Create() is "bulk" but I've discovered it isn't really; it's running individual saves, which is why it's slow-ish
[17:00:00] <zsoc> I know mongo's driver can do bulk via 'insert' but it's just one big call, which is why it errors on duplicates
[17:00:37] <kurushiyama> zsoc: You are kidding, right?
[17:01:01] <zsoc> I'm... new to this nosql magik. I believe things when I read them on stackoverflow
[17:01:29] <kurushiyama> zsoc: And actually, you can do bulk upserts https://docs.mongodb.org/manual/reference/method/Bulk.find.upsert/#Bulk.find.upsert
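A sketch of such a bulk upsert in the shell; the collection, match key, and newDocs array are hypothetical:

    var bulk = db.items.initializeUnorderedBulkOp();
    newDocs.forEach(function (doc) {
      // upsert: replace the matching document, or insert it if absent
      bulk.find({sku: doc.sku}).upsert().replaceOne(doc);
    });
    bulk.execute();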
[17:01:44] <kurushiyama> zsoc: I believe that too.
[17:02:16] <kurushiyama> zsoc: That makes mongoose even more... ...discussable.
[17:03:04] <zsoc> So really my question is... how do I do this thing with a mongoose model that is easy to do with the mongo driver. That doesn't seem like a very useful question.
[17:05:29] <kurushiyama> zsoc: StephenLynx and I actually had a discussion a few days ago, where I was playing devil's advocate for Mongoose. It took him seconds to tear me apart.
[17:11:19] <crazyphil> kurushiyama: so things shouldn't be tailing the oplog? I've found dozens of connectors all over github that do just that
[17:12:27] <kurushiyama> crazyphil: Oh, they can. No doubt about that. What I was trying to make clear is that the oplog gets rotated to a point where the last entry recognized by the connector is not available any more. The oplog is a _capped_ collection.
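The shell can show how much time the capped oplog currently covers, which helps judge whether the connector fell off its tail:

    // prints the oplog size and the time span between its first and last entry
    rs.printReplicationInfo();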
[17:12:49] <kurushiyama> StephenLynx: Entity relationship diagrams, to be specific
[17:13:12] <StephenLynx> yeah, but can it describe something like
[17:13:40] <StephenLynx> this document has a field that is an array of objects and these objects have these fields, one of them is a string allowed to have this maximum length
[17:14:02] <crazyphil> kurushiyama: well the connector if restarted runs for a while, then stops after importing more records
[17:14:36] <kurushiyama> crazyphil: Here is my guess: it takes an oplog entry and processes it. Then it takes the next entry
[17:15:24] <kurushiyama> crazyphil: depending on the size of your oplog and data volatility, the time needed to process the first entry might be enough for the oplog to rotate past the connector's position
[17:19:16] <crazyphil> well I've got over 24 hours in my oplog
[17:19:33] <crazyphil> and it was working just fine until it caught up to being in-sync with the current time
[18:15:07] <cheeser> i can't even imagine *why* you'd want that. :)
[18:19:04] <zerOnepal> I am in trouble, upgrading from 2.6 to 3.0. I need to change the storage engine from mmapv1 to perconaft for IO, and my collections weigh 1TB with around 500775006 records per db.col.count()... so
[18:19:54] <zerOnepal> It will take days to dump such heavy collections and restore them to the new 3.0 :(
[18:20:49] <cheeser> restart your 2.6 as a replica set. add the new 3.0 node (+ another replica member or an arbiter). sync. decommission the 2.6 node.
[18:26:19] <zerOnepal> cheeser, I am thinking of archiving old data, like keeping only the last 6 months of data, since my application no longer needs old records... that way I will drop the size from 1TB to 200GB... maybe then I could dump and restore... any words of wisdom for this?
[18:30:09] <zerOnepal> cheeser, can you elaborate a bit on your replica set suggestion... and things
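A rough shell sketch of cheeser's path, with hypothetical hostnames, after restarting the 2.6 mongod with a --replSet option:

    rs.initiate()                 // turn the standalone 2.6 node into a one-member set
    rs.add("newbox30:27017")      // add the 3.0 node running the new storage engine
    rs.addArb("arbiter:27017")    // or add another data-bearing member instead
    rs.status()                   // wait until the new node reaches SECONDARY state
    rs.stepDown()                 // on the 2.6 primary: hand off the primary role
    rs.remove("oldbox26:27017")   // finally, decommission the 2.6 node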
[20:08:41] <kurushiyama> I could elaborate a bit on my collection of knives, but I guess that was not meant with "things" ;)
[20:17:31] <Mastema> So I've made the switch from the mongo c driver to the c++ driver. That resolved all my issues. I no longer get the runtime assertion errors. It's probably a bug to be honest.
[21:38:27] <kurushiyama> Mastema: Would be good to file a bug report, then.
[22:04:36] <magepug> having issues trying to pull a dict key from a dictionary in a document using the c# driver. is that something i can get a bit of help with here, or is this chan focused on mongodb specifically?
[23:25:48] <Keverw> Hmm. Are all those stories out there about not using MongoDB true? Some friends tell me it's bad, and some see it as being good and use it. Also, I was looking into a closed-source database, but decided to let the company know I wasn't interested and was going with MongoDB. So hmm. I just don't want to make the wrong choice.
[23:29:17] <Keverw> Then someone was telling me a lot of the info about MongoDB being bad is outdated and it's improved since then... Wish I had some better guidance. MongoDB looks pretty good, but I haven't used it for anything in production yet.
[23:36:45] <cheeser> it's used in a lot of places with great success. it's not a perfect fit for everything (though it is for a *lot*).
[23:38:33] <Keverw> Yeah. I have seen all the clients using it on the site, plus I know some personally (well, not on the site, but they use it). Just, hmm. I'd hate to make the wrong choice, but I feel like I am wasting time researching databases and debating with myself
[23:41:45] <Keverw> One of my goals is to store a bunch of messages, be able to run it on a sharded cluster and search them. So think like searchable chat history, status messages and stuff along those lines.