[03:55:27] <supershabam> I started an experiment to tail the oplog and write stat summaries automatically over 5m windows. Has anybody else worked with storing metric datapoints in mongodb?
[04:39:26] <Nick______> #mongodb Not able to load the extension on php 5.6
[04:48:18] <quadmasta> I've got a MEAN app that runs on distributed hardware. The nodes try to connect to the master and failing that they connect to their own instance. I need to, on occasion, grab the log collection from the nodes and push them into the master database. I don't know how to describe that simply enough to become a search query
[04:48:47] <quadmasta> could someone suggest some terms to use to search?
[07:02:02] <morenoh149> how can I tell from explain output whether the whole collection was scanned or not?
[07:03:02] <joannac> do a count on the collection and see how many docs are in it?
[07:06:24] <morenoh149> I'm just going to assume that because it says b-tree cursor it didn't do a full scan
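A minimal mongo shell sketch of reading the 2.6-style explain() output to tell whether a full collection scan happened (collection and field names are hypothetical):
    var plan = db.things.find({ field: "value" }).explain();
    // "BasicCursor" means the whole collection was scanned;
    // "BtreeCursor field_1" means the index on `field` was used
    print(plan.cursor);
    // nscanned close to the collection count also indicates a full scan
    print(plan.nscanned + " of " + db.things.count() + " documents examined");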
[07:08:03] <therrell> Hi everyone, I have a mongo cluster and I write to it. I then immediately try and read that document I just wrote and mongo can’t find it. Is this a common issue with default write concern behavior on mongo clusters?
[07:08:22] <therrell> Here’s a gist of the read & write code using a java client: https://gist.github.com/anonymous/3768f4f276e0a5e916a0
[07:08:59] <therrell> It's worth noting this is an intermittent issue where the read operations return null
[07:09:32] <therrell> so I’m thinking my write concern isn’t set correctly for my use case of write then immediately read
[07:10:06] <therrell> Does anyone here feel they might be able to shed some light on this issue?
[07:28:19] <bo_ptz> [Please help] mongo>>> db.objects.find({siteId: "boris"}) returns an endless stream of '\' characters
[07:45:21] <Pharmboy> sorry for posting javascript question in mongo forum, you are absolutely right it is not mongo
[07:45:46] <Pharmboy> but now I am on the right track
[07:51:26] <therrell> In case anyone paid attention to my question, I figured it out. I’m using the 2.2 java client, which has a default write concern of normal (now known as unacknowledged); that’s fire and forget and provides no guarantee that the write operation completed before mongo responds successfully.
[07:52:37] <therrell> So my issue can be fixed by upgrading to the 2.10 client which has the default write concern of acknowledged or I can configure my 2.2 client to use the non-default write concern of my choosing
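A shell-level sketch (collection name hypothetical) of requesting an acknowledged write explicitly instead of relying on the driver default:
    // wait for the primary to acknowledge the write before returning
    db.events.insert({ x: 1 }, { writeConcern: { w: 1 } });
    // or wait for a majority of the replica set, with a timeout in milliseconds
    db.events.insert({ x: 1 }, { writeConcern: { w: "majority", wtimeout: 5000 } });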
[07:54:08] <Boomtime> bo_ptz: your problem is most likely that there is invalid UTF-8 in a document field and the shell is failing to render it
[07:54:55] <Boomtime> bo_ptz: please understand the server does not care, it can handle whatever crazy string you push at it even if it's invalid, the only rule it has is that the first zero-byte marks the end of the string
[07:55:22] <Boomtime> bo_ptz: what other language do you use? (php, java, c#, etc)
[07:56:30] <Boomtime> ok, where are you sourcing the data?
[07:56:44] <Boomtime> the other possibility is you have found a bug in the shell
[07:57:45] <Boomtime> anyway, if you want to find the document that is causing this, you can grab each one of the docs in that list one at a time by _id and see if the shell can print it
[08:03:28] <bo_ptz> but why don't I have the problem when I do db.objects.find().limit(1)?
[08:04:33] <flusive> Hi, I have a problem with disk space on one shard (it's 98% used); I also have other shards (shard1: 50% disk space free, shard2: 20% disk space free). What can I do to increase free space on shard3, and how can I force mongo not to use more space on shard3?
[08:05:19] <Boomtime> bo_ptz: limit(1) means you only see one document in the shell
[08:05:31] <Boomtime> that particular document apparently isn't the problem document
[08:05:52] <Boomtime> and that prints only 10 documents
[08:06:47] <Boomtime> do you understand that the problem is just displaying the result in the shell?
[08:06:50] <bo_ptz> Boomtime, you're right, when I print it in the console it fails
[08:07:39] <Boomtime> you may find one document, or you may find several are a problem, you will need to play around a bit now to find out why the problem documents can't be printed in the shell
[08:07:53] <Boomtime> in my experience, this has *always* been because of invalid UTF-8
[08:08:13] <Boomtime> it is very common, because there are just tonnes of libraries that do not handle UTF-8 correctly
[08:08:25] <joannac> mongodump and then bsondump might help? or maybe mongoexport ?
[08:09:04] <bo_ptz> Boomtime, it's not only a display problem; when I fetch the data in nodejs with the native driver, it returns an object full of \\\\\
[08:10:25] <Boomtime> yep, because the characters are invalid
[08:10:54] <Boomtime> that's pretty much proof that the data is not valid UTF-8, that node is trying to interpret it as such and the translation comes out as garbage
[08:11:52] <Pharmboy> are mongoose questions acceptable here if I get crickets in the mongoose channel?
[08:12:59] <teoo> hello.. could anyone please check this SO question, and maybe answer, if you have experience? http://stackoverflow.com/questions/28684462/what-is-faster-to-search-in-hashmap-object-or-array-in-mongodb
[08:18:20] <Pharmboy> I asked it a while ago, it is in regards to $addToSet $each ... someone confirmed my syntax was correct, but when I pass it in node/mongoose, it seems there is a bracketing issue
[08:18:42] <Pharmboy> and node is trying to parse an element of my array as if it was a function
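For comparison, the shell form of $addToSet with $each that the question is about; the usual bracketing mistake is nesting $each somewhere other than directly under the array field (names are hypothetical):
    db.users.update(
      { _id: userId },
      { $addToSet: { tags: { $each: ["red", "green", "blue"] } } }
    );
    // wrong shapes like { $addToSet: { $each: { tags: [...] } } }, or passing the
    // array without $each (which adds the whole array as a single element), are
    // common causes of this kind of error in node/mongoose code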
[08:33:33] <kakashi_> I've got a question: mongodb couldn't be opened...
[08:44:27] <kakashi_> log: http://pastebin.com/BbFjVVrM, after I saw this log, mongodb hangs...
[08:46:32] <flok420> hi i have a problem converting a mongo query into the PyMongo equivalent. on the forum nobody knows the answer but maybe someone in this channel knows it? the posting is at https://groups.google.com/forum/#!topic/mongodb-user/32m6T_et3MY
[09:17:39] <flok420> it looks like it works with the match and the unwind. if i add the $redact statement then nothing is returned. I verified that max_age is 0. also with an explicit 0 (instead of that variable) I get nothing. if i replace "$ts" by "ts" then all data is returned, as if the whole $gt is not executed
[09:25:08] <teoo> could anyone please check this SO question, and maybe answer, if you have experience? http://stackoverflow.com/questions/28684462/what-is-faster-to-search-in-hashmap-object-or-array-in-mongodb
[09:40:05] <flok420> joannac: if I remove the redact, then I get all data in the collection. so something is wrong with that
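A hedged sketch of the kind of pipeline being described (collection and field names are hypothetical), with two things worth checking noted in the comments:
    var cutoff = 0;  // stand-in for the max_age-derived value
    db.samples.aggregate([
      { $match: { host: "example" } },
      { $unwind: "$values" },
      { $redact: {
          $cond: [ { $gt: [ "$ts", cutoff ] }, "$$KEEP", "$$PRUNE" ]
      } }
    ]);
    // if the path "$ts" does not resolve on the documents $redact sees, the $gt
    // comes back false and everything is pruned, which would match "nothing returned";
    // a bare "ts" (no $) is a string literal, and strings sort above numbers in BSON
    // comparison order, so that condition is always true and everything is kept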
[10:03:09] <flusive> Please help me solve a problem with free space on a shard. What happens when my shard is full? Will the whole sharded cluster crash? What can I do to increase free space on shard3, and how can I force mongo not to use more space on shard3? I need some solution to store around 25GB/week
[10:04:01] <flusive> maybe you know some hosting for that? i don't have my own infrastructure
[10:35:03] <styles> I'm trying to organize game cards (magic etc). I want a Card object to belong to a Set. I need to search on different factors. My idea was to add the Set to each card object.
[10:35:34] <styles> Or keeping cards separate from sets and just linking the set's ID and required info into each card object itself
[10:44:07] <joannac> styles: heh, i tried that with my MtG cards
[10:44:29] <joannac> if you want to do things like "find all cards in this set", just embed the setname/setID in the document
[10:44:54] <styles> joannac, that's what I figured
[10:45:03] <styles> joannac, I was just going to duplicate the data, it's trivial anyway
[11:53:25] <andrewjones> Hi can anyone help with a bit of background on mongodb
[11:53:49] <andrewjones> Just want to check the function of the arbiter and replica set members
[12:10:26] <esko> i have a collection with title: and anchor:, i would need to insert an incrementing id: into every (title: anchor:) document in the collection, could someone help?
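One way to do this from the shell is to walk the collection in a stable order and $set a counter on each document; a sketch, with the collection name hypothetical:
    var counter = 1;
    db.links.find({}, { _id: 1 }).sort({ _id: 1 }).forEach(function (doc) {
      db.links.update({ _id: doc._id }, { $set: { id: counter } });
      counter += 1;
    });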
[12:40:37] <andrewjones> Hi can anyone help with Mongo MMS? We're using it on a standalone server with no internet connection, as the guides imply you can, but we can't find out what the default login credentials would be on a standalone instance of MMS
[12:42:46] <StephenLynx> I'm not familiar with MMS, but isn't it the same as a default install, with no auth?
[12:43:16] <andrewjones> no we have it set up but it asks for user login details and we're not sure how you set this up
[12:43:52] <andrewjones> Tried to contact mongo directly about purchasing support but can't get hold of them
[13:27:20] <jayjo> What's the best way to write json data to mongodb? Would this be a bulk write operation from my language of choice and associated driver, or can I do this through mongo shell?
[13:28:18] <hayer> I have a document which looks like this http://pastebin.com/hevptGEf .. how can I select all the elements in data which has a date between X and Y?
[13:29:41] <jayjo> cheeser: thanks, that looks like it works!
[13:35:23] <joannac> hayer: if you are ever in a situation where you think "I want a subset of this array" your array should be split into top-level documents
[13:38:56] <hayer> joannac: okey, so I should have a document called "data", where each item has a "item_id" that points to the ID of the document containing things like name and somevalue?
[13:39:31] <StephenLynx> yes. sometimes using a pseudo relation is better than having a sub-array.
[13:40:08] <hayer> StephenLynx: but how will this be if the data array is massive? Like millions of rows.
[13:40:33] <StephenLynx> even better to have as a separate document, because BSON has a limit on the size of documents.
[13:41:10] <hayer> a separate document for each data entry? Aka each entry in the data-array should be one document?
[13:41:13] <StephenLynx> and since you can index, performance will not be much of an issue. just make sure you don't make something that will need multiple queries.
[13:41:27] <StephenLynx> yes, that is what we are suggesting in this case.
[13:41:50] <StephenLynx> let me link my model for one of my projects, hold on
[13:42:14] <StephenLynx> look at the posts collection.
[13:42:21] <hayer> Wouldn't that mean eventually I will have millions of documents? I've been told that document = table, but with no structure set in stone.
[13:42:41] <StephenLynx> a document is an entry in that collection.
[13:43:31] <StephenLynx> the posts collection used to be a sub array in the threads collection. but because I regularly had to query for just some posts in that thread and the 16mb limit, I chose to put them on a separate collection.
[13:44:19] <hayer> okay, so far I've created one collection for each vehicle. Each vehicle has data. Each element of data should be a separate document inside the vehicle's collection..?
[13:44:45] <StephenLynx> don't you mean it is one document for each vehicle?
[13:44:58] <StephenLynx> and all these documents in the same collection?
[13:45:20] <StephenLynx> and no, these data documents should be stored in a separate collection.
[13:45:25] <hayer> typo, yes. One document for each vehicle.
[13:46:01] <hayer> Okey, so one vehicle_info collection and one vehicle_data collection?
[13:46:35] <hayer> I'm starting to see how much I've misused this so far.
[13:46:52] <StephenLynx> now you either use _id or create your own unique field for each vehicle and use it to track which vehicle each data document belongs to.
[13:47:01] <StephenLynx> since you don't have actual foreign keys.
[13:47:27] <StephenLynx> you do have field references, but in the driver they just perform multiple queries behind the scenes.
[13:48:02] <hayer> So I can then ask the vehicle_data collection to return all documents where vehicle_id is X and date is between Y and Z! This makes so much more sense.
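A sketch of what that layout and query could look like in the shell (collection and field names are hypothetical):
    // one document per reading, tagged with the owning vehicle's _id
    db.vehicle_data.ensureIndex({ vehicleId: 1, date: 1 });
    db.vehicle_data.insert({ vehicleId: someVehicleId, date: new Date(), somevalue: 42 });
    // "all data for vehicle X between Y and Z" is then a single indexed find
    db.vehicle_data.find({
      vehicleId: someVehicleId,
      date: { $gte: startDate, $lt: endDate }
    });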
[13:48:32] <jayjo> Is there a reason when I imported a .5gb file with mongoimport my db is now 1gb? is that standard?
[13:50:14] <StephenLynx> jayjo keep in mind mongo preallocates disk space.
[13:50:39] <jayjo> duh, you're right that has to be it
[13:51:58] <cheeser> most databases preallocate "extents" so that information is stored close together.
[14:04:26] <jayjo> just fyi - I just wrote the next file and it is almost exactly 2x the import file size
[14:04:53] <jayjo> so now I'm at 7.95 GB, I wonder if that does have to do with space being allocated
[14:41:02] <flusive> Please help me solve a problem with free space on a shard. What happens when my shard is full? Will the whole sharded cluster crash? What can I do to increase free space on shard3, and how can I force mongo not to use more space on shard3? I need some solution to store around 25GB/week
[14:48:48] <bros> I have a key called "store_id" in a lot of tables. I'm finding myself having to do an extra query every time I call store_id, to verify that the store_id belongs to account_id. Would it be smarter to add account_id as well to avoid the extra query?
[14:49:18] <StephenLynx> how many extra queries do you have to perform per operation, bros?
[14:49:39] <bros> I want to make sure the user isn't trying to "spoof" a store ID and gain access to somebody else's data.
[14:50:00] <bros> Every time I do a query with a store, I precede it with a query for that users account and stores
[14:50:14] <StephenLynx> I would cache the user validation data.
[14:50:57] <bros> I considered doing that but couldn't come up with a reliable way for logged-in clients to get notification of the refresh while already logged in.
[14:59:00] <StephenLynx> yeah, I would have 3 collections then
[14:59:06] <StephenLynx> account, users and stores.
[14:59:30] <StephenLynx> the store has a field with the account it belongs to, and so do users.
[15:00:13] <StephenLynx> so you can just cache the user validation data, including the account it belongs to. so you can easily check against the account the store belongs to when you retrieve it.
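A sketch of the denormalisation being suggested (field names are hypothetical): keeping the owning account on each store document lets the ownership check ride along with the lookup itself instead of needing a preceding query:
    // stored once when the store is created
    db.stores.insert({ _id: storeId, accountId: accountId, name: "Main St" });
    // later, a spoofed store_id simply fails to match
    var store = db.stores.findOne({ _id: requestedStoreId, accountId: sessionAccountId });
    // store === null  ->  either no such store, or it belongs to another account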
[15:05:08] <StephenLynx> and since you are using node/io, I assume, you should be checking for the error anyway
[15:05:54] <StephenLynx> but for that you will need a separate collection, I guess. I tried to make an index for documents in a sub array and failed. but I could be wrong.
[15:07:00] <bros> Can we get anybody else in the channel to confirm?
[15:12:23] <GothAlice> (Since "inserts" to sub-arrays are actually $push operations, they're updates. In your update, query for $ne or $not matching the potential duplicate value, and your record will either push the value (nModified=1) as there is no conflict, or conflict and not update the record. An index will speed up that check, potentially. The only way to tell if it actually helps is by telling MongoDB to explain the query with and without the index for
[15:14:45] <GothAlice> Still part of your collection.update({}, {$push}) operation. No you won't.
[15:15:12] <StephenLynx> if you have a unique index, then yes. but if you just write a query that will match 0 documents, then no.
[15:16:02] <StephenLynx> you can only get the updated document with an operation whose name I forget, but it will only return IF it actually updates the document.
[15:17:02] <GothAlice> findAndModify—many caveats to use.
[15:17:12] <GothAlice> (Returns document prior to modification, amongst other goodies. The manual page has more.)
[15:17:20] <StephenLynx> not to mention you will have to just assume the query conflicted with existing values. it could be that the query parameters were wrong in the first place.
[15:18:18] <StephenLynx> using separate collections with unique indexes is the safest approach.
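A sketch of the $ne-guarded $push GothAlice describes, in shell syntax (names are hypothetical):
    // only push "blue" if it is not already present in tags
    var res = db.things.update(
      { _id: someId, tags: { $ne: "blue" } },
      { $push: { tags: "blue" } }
    );
    // res.nModified === 1  ->  pushed without conflict
    // res.nModified === 0  ->  the value already existed (or the _id didn't match)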
[15:24:24] <styles> I'm trying to organize game cards (magic etc). I want a Card object to belong to a Set. I need to search on different factors. My idea was to add the Set to each card object.
[15:25:05] <styles> Or keeping cards separate from sets and just linking the set's ID and required info into each card object itself
[15:25:25] <StephenLynx> will you have to query for just some cards within a set?
[15:27:19] <StephenLynx> when querying for cards, will you have to query for the set of said card for additional data? if you do, will you have to perform just one additional query or multiple queries depending on the number of cards?
[15:30:52] <StephenLynx> depending on what you answer, you might be better with separate collections or subarrays.
[15:37:00] <NoOutlet> The best practice is to fit your data schemas to your application requirements. There isn't an inherent best way to store data without the application context.
[15:38:02] <bros> I simply want to be able to store/retrieve barcodes for items. Sometimes, items have variants. If that's the case, the barcodes belong to the variant, not the item. I'm not sure how to best represent that.
[15:38:41] <winem_> hi. I have some trouble understanding the rollback process of mongods in a replica set. is this right: the primary node writes the stuff into its datafile and keeps the documents in the rollback file until it gets an ack from the majority of nodes, e.g.? depending on the write concern of course
[15:39:15] <winem_> would just like to understand what happens in detail if the primary can't reach any of his slaves and a file system backup from the datadirectory is running...
[15:39:39] <jayjo> Is javascript the default language for working with mongodb?
[15:40:14] <StephenLynx> using js makes lots of sense.
[15:41:41] <jayjo> but is it true that no other language is 'enabled' working directly with it in the shell? all of the examples in the docs are using js functions
[15:42:14] <StephenLynx> it stands for javascript object notation
[15:42:31] <cheeser> jayjo: the shell is a js shell. so, no, writing c#, say, in the shell probably won't ever happen
[15:42:38] <NoOutlet> Well, no, the shell understands javascript. Yes.
[15:42:53] <StephenLynx> so yeah, any other language pretty much has to mimic it. you can use json on the shell client because it just parses text.
[15:44:31] <jayjo> I like js- i'm just getting a feel for this. If i'm writing mapreduce or other aggregation tasks, this is _always_ done through the shell, right?
[15:44:51] <NoOutlet> winem_, the primary gets some writes but is (for some reason) unable to send those writes to the secondaries. One of the secondaries becomes primary and takes some new writes. When the initial primary comes back into the replica set, it will see that there are new writes which it doesn't have and that the replica set doesn't have the writes that it got before the disconnection. That's when a rollback occurs.
[15:45:23] <StephenLynx> I thought the shell was only used by the shell client.
[15:45:38] <StephenLynx> I may be wrong, but I believe the driver performs all operations.
[15:46:28] <winem_> ok. so does the initial primary node write its stuff to the datafile before it gets the ack from the secondaries, or does the primary just write it to the rollback file and move the docs into the datafile when it receives the ack
[15:47:28] <NoOutlet> I don't know what you mean by the ack. The acknowledgement? or like the "ACK! I'm dying!"
[15:47:49] <winem_> the acknowledgement of the majority of secondaries, e.g.
[15:48:22] <NoOutlet> If the primary gets acknowledgment from the secondaries, then there won't be a rollback.
[15:48:45] <NoOutlet> A rollback occurs when a write has not been replicated properly.
[15:49:48] <winem_> ok. but are the documents written to the datafile or not?
[15:50:49] <winem_> if yes, it would mean that the primary writes to the datafile -> sends the documents to the secondaries and waits for their acknowledgement -> primary rolls back if it gets no acknowledgement and touches the datafile again
[15:50:59] <winem_> or am I just overcomplicating it???
[15:51:21] <NoOutlet> They may or may not have been written to a datafile in the primary. But the writes will be rolled back (taken out of the datafile) and written to the rollback folder.
[15:51:56] <winem_> sure, I'll be still and wait to learn
[15:55:14] <jayjo> last time I'll ask about the js - this page from the pymongo documentation has the mapreduce functions put in as js. is this unique to the python driver?
[16:09:28] <NoOutlet> A network partition causes lost connections between the primary and the two secondaries. This happens immediately after a couple writes which have not been replicated to the secondaries.
[16:09:51] <NoOutlet> So now the primary has writes that the secondaries don't know about.
[16:10:32] <NoOutlet> And finding out that the secondaries aren't reachable causes an election on both sides of the partition.
[16:11:00] <NoOutlet> On the primary side, there is only one voter, so there is no majority and the primary steps down to a secondary, no longer able to take writes.
[16:11:16] <NoOutlet> On the secondaries side of the partition, there are two voters and one of them becomes a primary.
[16:12:14] <NoOutlet> If there are no writes to the new primary before the partition ends, then there will be no need for a rollback.
[16:14:26] <NoOutlet> When the partition ends, the datasets will be compared and if it's seen that the initial primary (the one in a data center by itself) has writes that the others don't, those writes will be replicated to the others then.
[16:15:24] <NoOutlet> However, if some writes are made _during_ the network partition, then when the partition ends the writes that the initial primary had will be rolled back.
[16:16:11] <winem_> ok. this was helpful for the understanding
[16:16:44] <winem_> but how does the rollback work? I saw that there is a folder called rollback in the data-directory
[16:16:47] <NoOutlet> It's irrelevant to the situation about whether it wrote to the datafile or the journal.
[16:17:24] <winem_> ok, I understand. So I just have to think a bit more abstractly and not care whether it's written to the data file
[16:18:09] <NoOutlet> Essentially, the data in the initial primary will be synced to the initial secondaries (one of which became a primary and took writes).
[16:18:59] <NoOutlet> So the initial primary will not have the writes that it took immediately before the partition, but it will have the writes that the other primary took during the partition.
[16:19:10] <NoOutlet> And it will create a rollback file to dump out the changes.
[16:19:31] <NoOutlet> This is so that if an administrator wants to apply those writes, they are available.
[16:19:51] <winem_> great, this is the point that confused me!
[16:20:45] <winem_> so the rollback file is written when the initial primary went back online again (as a primary e.g.) and rolls back some operations because the new primary did not receive them
[16:20:50] <NoOutlet> If you know much about revision control like git or subversion or anything like that, it's a similar problem.
[16:21:02] <winem_> and now, it's up to the admin to decide whether he would like to process the rollback file manually?
[16:21:24] <NoOutlet> Basically, when there are writes to two systems within a replica set, mongodb doesn't know how or if the writes can be merged.
[16:22:40] <winem_> great. so it was that "easy"... I could not explain when he writes the rollback file or if the documents are kept in the journal / data file or the rollback file at the same time, etc...
[16:22:50] <NoOutlet> I think it's not likely that the initial primary would become primary again unless it had some priority.
[16:23:24] <winem_> thank you very much. I will go in the background and play around on the dev environment to see if I got everything right now. this was very helpful - thanks for your time!
[16:23:53] <NoOutlet> Because when it comes online, if there is a rollback, that means that it needed to take writes from the new primary.
[16:34:07] <freeone3000> I have three servers part of an 8-member replicaset who seem to be stuck in recovery. Their state is listed as "RECOVERING", but their optimeDate doesn't seem to increase. What could cause this?
[16:34:34] <winem_> NoOutlet: great, this was very helpful! :)
[16:49:01] <Siamaster> Hi, I'm trying to choose an ORM for working with java
[16:49:21] <Siamaster> I looked at morphia, but there weren't many tutorials
[16:49:48] <Siamaster> Does anyone use any that he can recommend?
[16:50:29] <Siamaster> are orms even good when working with mongo?
[16:52:35] <domo> @Siamaster it depends on your needs
[16:53:34] <domo> we use one we wrote based on mgo - https://github.com/maxwellhealth/bongo
[16:54:11] <domo> we also don't use it in certain places
[16:55:17] <StephenLynx> I wouldn't use anything besides the driver, Siamaster.
[16:56:24] <Siamaster> domo, I was counting on having to mix
[16:57:29] <StephenLynx> because they don't actually provide any functionality, they just add bloat to your project.
[16:58:09] <Siamaster> I read that morphia is gonna get merged into the driver
[16:58:11] <styles> StephenLynx, It's going to be multiple things. All cards based on a set. But also (here's the tricky part) a set can be "hidden" (not public). So I need to be able to query for all cards w/ no hidden set
[16:58:12] <StephenLynx> have been using mongo in two projects with nothing but the driver and having no issues at all.
[16:58:24] <styles> This makes me feel like I have to have the set in the collection of card
[16:59:01] <styles> Would I just... keep the set separate still too?
[16:59:11] <styles> And anytime I update the set info (name) locate all cards and update those too?
[16:59:37] <domo> eh, for the amount of functionality we get out of our bongo service, the tradeoff is negligible
[17:00:08] <domo> I mean, writing an ODM in golang vs something like PHP definitely helps
[17:00:17] <ra21vi> I am trying to build a complex aggregation query, but failing. The required aggregation and JSON data structure is at - https://gist.github.com/anonymous/78028b8b42ce2044f64b
[17:01:34] <ra21vi> Right now I have to hit multiple queries and then apply logic at app side to get the result. Need help in building aggregation query for same.
[17:02:45] <ra21vi> In given gist (https://gist.github.com/anonymous/78028b8b42ce2044f64b), the record file describes how data is stored in mongodb. In query, the required aggregation query is explained
[17:13:46] <ra21vi> can anyone please help me with the aggregation.
[17:18:21] <Siamaster> Do I gain something by using long for ids instead of ObjectId?
[17:18:39] <Siamaster> It should take less space right?
[17:40:33] <GothAlice> Indeed, a 64-bit integer would take up 4 fewer bytes. What you lose if you do that is everything, however.
[17:41:41] <GothAlice> Siamaster: MongoDB has no concept of "auto increment", which means you need to jump through hoops and you encounter many more race conditions than before: http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/
[17:42:51] <GothAlice> Notably, if you have multiple servers how do you synchronize ID creation between them to prevent duplication? ObjectId doesn't suffer this problem. (In fact, each client application connection can deal with ID creation in complete ignorant bliss of any other server or application creating IDs.)
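The linked tutorial's counters-collection pattern boils down to something like the following shell sketch (collection and sequence names hypothetical); every insert costs an extra findAndModify round trip, which is part of the hoop-jumping being described:
    function getNextSequence(name) {
      // atomically increment (and create on first use) the named counter
      return db.counters.findAndModify({
        query: { _id: name },
        update: { $inc: { seq: 1 } },
        new: true,
        upsert: true
      }).seq;
    }
    db.users.insert({ _id: getNextSequence("userid"), name: "Alice" });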
[17:44:50] <StephenLynx> yeah, it checks for types.
[17:44:57] <GothAlice> If you accidentally store an _id that is the hex-encoded string representation of the ObjectId, suddenly creation-order sorting goes out the window, as does range querying them.
[17:45:51] <GothAlice> (Pro tip: ObjectId includes the creation time allowing you to $gt and $lt query them to find records in date ranges. Also, unless you need to extract date-like values from a creation time, you won't need to store an additional creation time.)
[17:46:16] <GothAlice> Rather, extract date-like values from a creation time within an aggregate query. (Things like $hour, $minute, $day, etc.)
[17:46:35] <StephenLynx> hm, I guess I need to study ObjectId more.
[17:57:04] <Siamaster> what if, you already have a unique String id entity then? Would you create an ObjectId anyway?
[17:57:21] <Siamaster> In my case, I have a FacebookUser which I get a String Id from
[17:57:46] <GothAlice> If you have external ID sources, then no worries using that unique value as the _id in your collection. Making them unique is Facebook's problem. ;)
[17:57:57] <GothAlice> Just be absolutely certain you are consistent within a collection.
[18:02:21] <jayjo> If I have a query running a text search on an 8gb database and it's ~10 minutes in, should I be mad? ;) I didn't test it on a subset first
[18:03:55] <GothAlice> I'm not entirely sure how MongoDB's FTI search operates, however it's common to boolean search first, that way only a subset of the documents are ranked. If MongoDB does do this initial boolean search for you, then either your query is matching a whole lot more than expected or something's up. If it doesn't, then you might want to do that initial filter yourself. ;)
[18:04:30] <GothAlice> jayjo: There is the potential issue that your index doesn't fit in RAM… if this is the case then you may be there until doomsday. ;^P
[18:36:01] <eaSy60> Hello. Is there a way to create a query on the last element of an array inside a document?
[18:38:17] <GothAlice> eaSy60: There are several methods to almost do what you want. $elemMatch will let you query the contents of an array, but that only works if you store an index with the records, as you demonstrate. You could store a copy of the "last" record pushed outside the array, then it becomes trivial to get back. You could also do an aggregate query, unwind the array, and pick the $last of each of the fields when re-grouping.
[18:38:42] <StephenLynx> you can slice on find. or you can unwind the array.
[18:38:52] <StephenLynx> you can't slice on aggregate though
[18:39:31] <GothAlice> (Your $elemMatch example also has the race condition of needing to look up the size of the array first, to determine the index of the last element.)
[18:39:35] <eaSy60> Slice and conditions in the same query?
[18:40:09] <eaSy60> GothAlice: that was an example, I don't want to store an index or the last record :)
[18:41:11] <GothAlice> Storing the last record outside the array, and $set'ing it when you $push new values to the array, while a minor increase in processing (one extra operation) will allow you to easily query (and even index) the fields for that "last" record.
[18:41:40] <GothAlice> If I needed to query them, that's what I would have done for the forums I wrote. (As it stands, I just slice during projection since when I need the last reply to a thread I just need the last reply, no query.)
[18:43:11] <eaSy60> I need to get all the documents in a collection where document.anArray[last].status === 'online', I don't need to $push any thing.
[18:43:33] <eaSy60> I'm writing a .find query, not a .update query
[18:43:34] <GothAlice> I'm referring to how you update that data, not query it, when I refer to pushing. Are new values ever added to anArray?
[18:44:55] <eaSy60> yes I $push values in a different application
[18:45:04] <eaSy60> the .find query is for a cron job
[18:50:44] <GothAlice> This is an example of what I often refer to in MongoDB as non-duplication of data. It isn't technically a duplication of the data because it's pretty much required to query in the manner you wish.
[18:53:21] <GothAlice> Doing it the pure aggregate way would require a rather substantial $group projection that includes every field you care about, and the results come back in a manner different than standard queries. It would be more difficult to implement that way. Doable, but seriously, the $set when you $push makes everything extremely easy. You could limit the $set to a subset of the embedded document's fields if you really wanted, too.
[18:54:55] <GothAlice> (An aggregate would also require MongoDB to do a _lot_ more work.)
[18:55:07] <eaSy60> my array contains only : { status: String, date:Date.now }
[18:55:33] <GothAlice> You could $set lastStatus if that's the only value you care about on that last record. :)
[18:56:02] <GothAlice> This is also referred to as a form of pre-aggregation. (You're keeping what would be the result of an aggregate query up-to-date in the record itself during normal use.)
[18:57:20] <eaSy60> Maybe I should do one request with $elemMatch, and then postprocessing the result in order to filter the documents where the last element of the array match my condition
[18:57:41] <GothAlice> … again, that's even more work. And work that would require roundtrips from the database to your application.
[18:59:50] <GothAlice> Schemas aren't sacrosanct… especially in MongoDB which is effectively schema-less. Your model should model how you need to query your data, you shouldn't be falling back on application-side bulk processing basically ever.
[18:59:55] <eaSy60> okay, I'll do that with myArray[] + myArrayLast
[19:00:50] <GothAlice> (Bulk processing of multiple records application-side eliminates the point of even having a database. Might as well use on-disk JSON files. ;) (Ohgodspleasenobodydothis…)
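A sketch of the $push-plus-$set pre-aggregation being described, in shell syntax (collection and field names hypothetical):
    // the writer updates the array and the "last" copy in one operation
    db.devices.update(
      { _id: deviceId },
      { $push: { statusHistory: { status: "online", date: new Date() } },
        $set:  { lastStatus: "online" } }
    );
    // the cron job's query then becomes a plain (indexable) find
    db.devices.find({ lastStatus: "online" });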
[19:12:08] <hayer> I store a time field as .Net DateTime for when the data was created/inserted. I use this to select all data between date X and Y. Can I use the ObjectId.getTimestamp for the same?
[19:14:42] <GothAlice> hayer: Yes. Most ObjectId implementations have a factory method "from_datetime" or similar to allow you to construct an ObjectId for a particular moment in time, and you can then $gt/$lt range query _id.
[19:15:34] <GothAlice> It's very useful, and for most cases (where you aren't using $hour/$year/etc. projection against the creation time field in aggregate queries) eliminates the need for a discrete creation time field.
[19:17:39] <hayer> but is the time from when the item was inserted or when it was actually written to disk?
[19:18:17] <GothAlice> In most cases the client driver is what constructs the ID, so it would be the moment of the .insert() call, not the time it was committed to disk. (The ID needs to already exist at that point.)
[19:18:40] <GothAlice> As an example, using MongoEngine in Python: ohist = Invoice.objects(criteria).filter(id__gt=ObjectId.from_datetime(now-timedelta(days=30)), state__ne='voi').count()
[19:19:24] <GothAlice> (Count the Invoice documents matching mostly security-related search criteria, i.e. ownership, whose ID indicates the invoice was created within the last 30 days and state isn't void.)
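The shell equivalent of that from_datetime trick is to build an ObjectId whose leading four bytes encode the target time; a sketch (collection name hypothetical, mirroring the Invoice example above):
    function objectIdFromDate(d) {
      // seconds since epoch in hex, padded out to the full 24 hex characters
      return ObjectId(Math.floor(d.getTime() / 1000).toString(16) + "0000000000000000");
    }
    var since = objectIdFromDate(new Date(Date.now() - 30 * 24 * 3600 * 1000));
    db.invoices.find({ _id: { $gt: since }, state: { $ne: "void" } }).count();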
[19:40:14] <StephenLynx> the problem with these engines is the coupling.
[19:40:18] <StephenLynx> that is plain bad design.
[19:40:33] <StephenLynx> you don't have just a db, you have THEIR db.
[19:41:08] <Siamaster> not only that, it makes you too dependent on GAE and you need to always think about the pricing when designing your db
[19:41:17] <Siamaster> you have to anyway, but datastore was just too extreme
[19:41:36] <StephenLynx> yeah, I heard google boned people pretty hard on disk space.
[19:42:21] <Siamaster> you can't query on field you don't index in datastore
[19:42:42] <Siamaster> and every index will cost and allocate as much space as the row itself
[19:42:58] <Siamaster> so if you have a many to many relationship like User, Store, UserVisitStore
[19:43:23] <Siamaster> and you want to be able to query stuff like which stores does a user visit or which users are visiting a store
[19:43:32] <freeone3000> I'm getting "RS101 reached beginning of local oplog" when adding a new member to a replicaset after running rs.reconfig() on primary.
[19:56:37] <fewknow> ra21vi: because there are faster engines out there that are easier to use. Really depends on the aggregation, but complicated ones should be done in other engines
[19:56:51] <ra21vi> fewknow: just for a single query, I won't opt to handle two DBs. I just moved my DB to mongo, and I'm trying to map queries in mongo as much as possible
[19:56:59] <freeone3000> fewknow: https://gist.github.com/freeone3000/6c705b8979b0df843e54 is the logs. Error is "RS101 reached beginning of local oplog", which is the last message from "heartbeat" in rs.status()
[19:57:22] <ra21vi> fewknow: can you suggest some engines, I will read about them
[19:57:31] <fewknow> ra21vi: if you abstract away the db with a "data access layer" you don't need to know what DB the data comes from
[19:57:44] <fewknow> I use mongodb, elastic and hadoop all in same infrastructure
[20:01:14] <fewknow> thought you were having a problem adding replica set member?
[20:01:49] <freeone3000> fewknow: Yes. It's in FATAL immediately after adding, with the complaint "RS101 reached beginning of local oplog" in rs.status().
[20:02:25] <fewknow> but there is nothing in the log files about it?
[20:03:18] <fewknow> can you paste your rs.config() and rs.status()
[20:08:39] <freeone3000> fewknow: It's for a company whose customers, do, indeed, move that often. We also have some very strict time requirements.
[20:08:57] <freeone3000> fewknow: Executives, salespeople, government ministers, etc.
[20:09:21] <fewknow> you can use a caching layer or something to solve that issue...if you want to distribute the data and scale it you will need to shard
[20:09:30] <fewknow> you are limited on how many secondaries you can have
[20:09:39] <fewknow> also the more secondaries you have the more stress on the primary
[20:16:29] <freeone3000> Changed it to https://gist.github.com/freeone3000/50e21076527bd1808be9 with unique priorities. Have an odd number, since id 3 is a non-voting member. Still have id 7 in FATAL.
[20:19:46] <fewknow> freeone3000: just changing the priority is not going to fix the issue. Once you have the config correct you will need to remove the node and add it back. I would tail the log file when you add it back to see what is going on....tail the primary and the node you are adding
[20:22:52] <freeone3000> fewknow: Adding the node back on primary claims that the node is down.
[20:23:41] <freeone3000> fewknow: And now it's back into FATAL.
[20:31:00] <freeone3000> Any other suggestions? "RS101 reached beginning of local oplog" seems to mean that it's ahead of primary - how can I force it to understand that it's not?
[20:32:48] <fewknow> you can restore a backup and then add the node in.
[20:33:04] <fewknow> i am not sure why it is in FATAL...seems like a config issue where it doesn't know that it is a secondary
[20:40:02] <freeone3000> fewknow: Can't recover from backup - next backup window isn't available yet, and this replset had run out of disk at the last backup.
[20:40:21] <freeone3000> (Trying to restore those backups got a secondary stuck in RECOVERY with a last optime date of last friday, with no updates.)
[20:48:15] <hayer> How do I specify sort order in c# when using BsonDocuments and the FindAsync?
[20:48:34] <hayer> I've looked at the FindOptions but can't get it set up properly.
[22:02:40] <jordanpg> what's the correct syntax for #command?
[22:20:25] <jordanpg> added question to SO: http://stackoverflow.com/questions/28707415/what-is-the-correct-syntax-for-the-dbcommand-method-that-is-replacing-the-conve
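If the question is about the generic command helper, the shell form is db.runCommand with the command name as the first key; most drivers expose a similar command/runCommand method. A sketch using convertToCapped (collection name and size hypothetical):
    // convert an existing collection to a 100000-byte capped collection
    db.runCommand({ convertToCapped: "log", size: 100000 });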
[23:59:59] <ra21vi> is mongo recommended as a spatial/geo db for a 10% write and 90% read load scenario (geo queries, mostly radius searches)...