[00:03:17] <cipher__> I started mongos (numactl --interleave=all mongos --fork --nohttpinterface --logpath /var/log/mongodb_router.log --logappend --configdb 192.168.100.81:27019,192.168.100.82:27019,192.168.100.83:27019), it ran fine, and i ran mongo localhost:27017/admin
[10:10:29] <shoshy> hello, i'm using mongodb 2.4 on ec2. i ran "db.groups.update({},{$set:{loc: { type: [Number], index: '2dsphere’}},{multi:true})" on the server. Since then, when i run db.groups.find() i get "SyntaxError: Unexpected identifier,Error: 16722 SyntaxError: Unexpected identifier", and mongohub crashes if i use it on that collection.
[10:12:34] <Nodex> 2dsphere expects an array does it not?
[10:13:25] <shoshy> i just wanted to add an empty GEOJSON property to the existing schema items
[10:13:37] <shoshy> and the answer is yes, it does..
[10:14:20] <shoshy> it's just that i'm new to mongo and to its GeoJSON properties... how can i add an empty coordinate field that is GeoJSON compliant
[10:14:50] <shoshy> {loc: { type: [43.32,56.777], index: '2dsphere’}} for example?
[10:15:04] <Nodex> if your document doesn't have one then it won't be indexed
[10:15:27] <Nodex> not every key has to have a value
[10:16:01] <shoshy> ok, and if i wanted to add a value... a point? like i wrote?
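For reference, a minimal mongo-shell sketch of what shoshy seems to be after. The `type: [Number], index: '2dsphere'` keys belong in a Mongoose schema definition, not in the documents themselves; in the shell, a GeoJSON point and its index would look like this (collection and field names are from the conversation, the coordinates are placeholders, and GeoJSON order is [longitude, latitude]):

    db.groups.update(
        {},
        { $set: { loc: { type: "Point", coordinates: [ 56.777, 43.32 ] } } },
        { multi: true }
    )
    // the 2dsphere index is declared on the field, separately from the data
    db.groups.ensureIndex({ loc: "2dsphere" })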
[10:28:43] <Nodex> the official driver is Mongodb Native
[10:29:30] <Nodex> https://github.com/mongodb/node-mongodb-native <--- official
[10:29:36] <shoshy> you're right... they do mention on the mongodb site that: "Mongoose is the officially supported ODM for Node.js. It has a thriving open source community and includes advanced schema-based features such as async validation, casting, object life-cycle management, pseudo-joins, and rich query builder support."
[10:30:19] <shoshy> sorry, i'm new to this, i just started using it 2 days ago... and using an ODM , which seems very up to date, seemed the logical thing to do
[10:30:56] <shoshy> so i don't know what they mean by "officially supported ODM for node.js" if it's not..
[10:31:20] <shoshy> as for mongodb native they write the same "The MongoDB Node.js driver is the officially supported node.js driver for MongoDB"
[10:31:37] <Nodex> it has pros and cons, it's recommended to learn mongodb types and functionality first before having it all stripped away from you in an abstraction layer
[10:34:07] <shoshy> ok... thanks! you got me thinking more about this
[10:34:28] <shoshy> it made things work faster for me (ODM)
[10:41:38] <shoshy> you're probably right... i should switch to the native driver
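A minimal sketch of connecting with the official native driver Nodex links above (the URL and collection name are placeholders; this assumes the 1.x-era driver API):

    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
        if (err) throw err;
        // same query you'd run in the shell, with no schema layer in between
        db.collection('groups').findOne({}, function (err, doc) {
            console.log(doc);
            db.close();
        });
    });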
[10:45:56] <romaric> Hi everybody, we are experiencing an issue on our sharded system. We are sending a batch insert to our mongos, which splits the data nicely between our two shards, but it seems that the two write operations are done sequentially. Does someone have any idea? You can see the very verbose mongos log here: http://pastebin.com/H0K05kwq
[13:15:08] <eagen> romaric Based on http://docs.mongodb.org/manual/faq/concurrency/ your inserts to each shard should be able to happen at the same time. Interesting.
[13:26:32] <eagen> romaric According to http://docs.mongodb.org/manual/core/bulk-inserts they recommend inserting to multiple mongos instances to parallelize imports. This blog has some ideas as well: http://cfc.kizzx2.com/index.php/slow-batch-insert-with-mongodb-sharding-and-how-i-debugged-it/
[13:41:11] <remonvv> eagen : Not sure who wrote that but multiple mongos instances won't do all that much for insert performance.
[13:59:48] <romaric> thank you eagen, but isn't it the job of the mongos to route the docs to their shards, where the writes happen at the same time?
[14:01:30] <remonvv> romaric: What is it you're trying to do exactly? Bulk insert and you don't see full utilization of all the shards?
[14:07:03] <romaric> hi remonvv, I'm doing bulk inserts and I do not see any improvement in write speed. What is weird is that we can see the two shards filling up with approximately half of the list of documents each (20k batch insert, ~10k on each shard at the end), but it seems that the two background 10k batch inserts, sent by the mongos, are done sequentially
[14:07:13] <eagen> romaric I totally agree. Mongos _should_ handle it but apparently it doesn't.
[14:07:19] <romaric> according to these logs: http://pastebin.com/H0K05kwq
[14:07:31] <romaric> l.03: mongos is going to insert approximately half of the original 20k-doc batch insert on shard 1
[14:07:37] <romaric> l.07: mongos waits for writebacks on shard1
[14:07:42] <romaric> l.10: mongos is going to insert the other half of the original 20k-doc batch insert on shard2
[14:08:17] <romaric> before saying that mongo does not do something I want to keep trying & I hope the issue is me instead of mongo ;)
[14:11:01] <remonvv> romaric: Is the index hashed?
[14:11:21] <romaric> remonvv: this is not ObjectId, this is a binary id
[14:11:42] <remonvv> romaric: Okay, how big and does it increase over time?
[14:12:23] <remonvv> romaric: It sounds like you have hotspots in your chunk distribution (meaning; an above average amount of documents go to the same shard)
[14:12:59] <remonvv> romaric: Either way, unless your mongos is cpu or I/O bound there's no real value in having more of them. Not sure why it's suggested in the documentation.
[14:13:45] <romaric> hmmm that's what everyone says, but we did a 20k bulk insert & it resulted in two shards containing approximately 10k docs each
[14:15:26] <rspijker> I think it’s just batch insert that doesn’t play nicely with sharding
[14:16:24] <romaric> the shard key is 14 bytes with randomness inside, which results in two ~10k shards
[14:16:54] <remonvv> romaric: The end result is not that important. If you have a bad shard key it might do the 10k on 1 first and the other half on 2 sequentially. And there are various more subtle problems that can occur.
[14:19:32] <remonvv> Say you have 2 shards with 1 chunk each and your shard key is a date. It will fill up one shard with the earlier half of the date values and the other shard with the later ones (assuming the chunk split happened exactly in the middle of the date range).
[14:20:56] <remonvv> Basically you want your shard key values to be uniformly distributed across the value range and "random"
[14:21:28] <remonvv> "random" as in your insert order should not be the order of the shard key values
[14:22:04] <rspijker> romaric: didn't you tell me yesterday that you had a time component in the ids but you shuffled them somehow so that they weren't at the end?
[14:22:05] <romaric> I'm not sure I understand this "random"
[14:22:22] <rspijker> because, looking at your sh.status, the end of everything seems to be exactly the same...
[14:23:30] <rspijker> you will still have monotonically increasing ids
[14:23:32] <remonvv> romaric: Think of it like this; say you have two shards with one chunk each; the first chunk takes shard key values 0-9 and the other 10-19, if you insert 0,1,2,3,4,5,6 they'll all end up in shard 1 and shard 2 will be idle
[14:23:42] <remonvv> I'm not sure if I can ELI5 it more than that.
[14:24:33] <romaric> remonvv : ok, what happens if I bulk insert {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19} ?
[14:24:50] <rspijker> so, the xxxx part is actually random? Or, more importantly, if I take two documents out of your sequence will the xxx of the first always be smaller than that of the second?
[14:29:46] <rspijker> romaric: it’s like I said yesterday. All of the reads will always go into your maxchunk
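A sketch of the alternative remonvv's question at 14:11 points at: if the key can't be made uniformly distributed with respect to insert order, a hashed shard key (available since 2.4) decouples the two. Database and collection names here are placeholders:

    sh.enableSharding("mydb")
    // documents are placed by the hash of _id, so even a sorted or
    // monotonically increasing batch spreads across both shards
    sh.shardCollection("mydb.docs", { _id: "hashed" })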
[14:30:06] <saml> i could change the query to: db.docs.find({date:{$gt:ISODate('1900-01-01')}, count:{$gt:0}, status:'published'}).sort({date:-1, count:-1})
[14:30:20] <remonvv> romaric : Say you have 20 people in front of you. 10 girls and 10 guys. On your left you have a pink car for the girls and on the right you have a blue car for the guys. Your goal is to get all 20 people in their car as quickly as possible.
[14:30:29] <saml> db.docs.ensureIndex({date:-1,count:-1,status:1}) after this, that query is fast
[14:30:34] <remonvv> What you're doing is putting the 10 girls in line first and then the 10 guys.
[14:30:45] <Nodex> you could try a compound on date:-1, count:-1, status:1
[14:30:46] <saml> but i just blindly ensure index by using all fields appearing in the query and sort
[14:30:57] <saml> is there a more systematic way of approaching index setup?
[14:31:10] <Nodex> not sure how confidently the planner will pick it without being on 2.6
[14:31:13] <saml> yah that's what i did. but that's just a hunch
[14:31:33] <saml> planner picks that index (date:-1,count:-1,status:1)
[14:31:44] <remonvv> romaric : Make sense? You want to shuffle the guys and the girls so they can get in their cars at the same time rather than the guys having to wait for all the girls to get in their cars.
[14:31:53] <romaric> ok remonvv, I really understand this, let me explain why i still think the shardkey is good
[14:31:57] <saml> so do i set one index per query?
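A sketch of the usual heuristic saml is asking for: put equality fields first, then the sort fields in sort order, then range fields, and check the planner's choice with explain(). The index below is an assumption following that rule, not what saml actually built:

    db.docs.ensureIndex({ status: 1, date: -1, count: -1 })

    // explain() shows which index the planner chose and how many
    // documents it scanned beyond what it returned
    db.docs.find({ date: { $gt: ISODate('1900-01-01') },
                   count: { $gt: 0 },
                   status: 'published' })
           .sort({ date: -1, count: -1 })
           .explain()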
[14:31:57] <remonvv> I get the feeling I'm overdoing it on the analogies here..
[14:36:36] <rspijker> romaric: I gave it some more thought and if you are doing it exactly like you say you are, then your keys won't be monotonically increasing
[14:36:39] <romaric> remonvv : do you think I'm crazy ;) ?
[14:37:32] <remonvv> romaric : I don't think my assessment on your mental health will move this forward
[14:39:03] <remonvv> romaric : If you pick any random 2 documents from your batch, what are the odds of one going to shard 1 and the other going to shard 2
[14:40:05] <romaric> the odds of doc 1 going to shard 1 is 1 out of 2 and the same for doc 2
[14:40:23] <remonvv> rspijker: Not necessarily, depends on how his batches are ordered. He's basically doing the equivalent of turning YYYY:MM:DD hh:mm:ss into hh:YYYY:MM:DD mm:ss if I understand correctly.
[14:40:49] <remonvv> I might not have understood correctly.
[15:00:45] <romaric> the next step in our project is to upgrade to 2.6, maybe it's gonna be the actual step ;)
[15:01:55] <rspijker> 2.6 upgrade will likely resolve it for you
[15:02:04] <rspijker> until then, just loop through the inserts in your app
[15:02:12] <rspijker> see how that performs in sharded vs single mongod
[15:03:21] <remonvv> Hm, this mongostat dump looks exactly like a sequential insert. Unless 2.6 does black magic I don't think that will resolve this.
[15:04:52] <rspijker> well…. if the mongos is currently translating bulk inserts into separate inserts, very badly, some simple logic could vastly improve that
[15:12:14] <q85> Questions for you ladies and gents. Why does the primary drop connections when I apply a new config to hide a secondary? Does anyone know of an elegant way to take secondaries down without interrupting the primary?
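For the record, hiding a member is a reconfig, and rs.reconfig() drops open connections while the set renegotiates, which is likely what q85 is seeing. A hedged shell sketch (the member index is a placeholder, and hidden members must also have priority 0):

    cfg = rs.conf()
    cfg.members[1].priority = 0   // hidden members may not be electable
    cfg.members[1].hidden = true
    rs.reconfig(cfg)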
[15:18:58] <romaric> I sort the list of documents before inserting them because without doing that, the mongos ends up doing a small batch insert on shard 1, then on shard 2, then on shard 1, etc...
[15:19:38] <romaric> but sorting the batch insert ends up sorting the girls & the boys
[15:21:10] <remonvv> romaric : If you sort it you're basically creating the worst case scenario
[15:23:26] <remonvv> romaric : Shuffle your batch and do single inserts. See what happens.
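A shell-JS sketch of remonvv's suggestion: Fisher-Yates shuffle the batch so insert order no longer follows shard-key order, then insert one document at a time (the batch contents and collection name are placeholders):

    var batch = [ /* the 20k documents */ ];

    function shuffle(docs) {
        // Fisher-Yates: uniform random permutation, in place
        for (var i = docs.length - 1; i > 0; i--) {
            var j = Math.floor(Math.random() * (i + 1));
            var tmp = docs[i]; docs[i] = docs[j]; docs[j] = tmp;
        }
        return docs;
    }

    shuffle(batch).forEach(function (doc) { db.docs.insert(doc); });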
[15:30:21] <romaric> well we will probably upgrade to 2.6 and see
[15:31:18] <remonvv> I suspect that will solve exactly nothing in your specific test but keep us posted ;)
[15:33:40] <romaric> I'll keep you posted! Do you have any other idea than upgrading, remonvv? Let's say the single inserts are faster than the bulk operation in a sharded system (already weird to write this), what does that mean?
[15:36:26] <remonvv> romaric : It would mean that the single inserts can be applied in parallel whereas your bulk inserts can do so to a lesser extent. Bulk inserts are not a magic bullet. They're a bit faster...most of the time...
[15:53:02] <joshua> wawaweewa whats going on with MMS
[16:34:36] <jamis> is it possible to do an upsert and $push on an array that is deeper than 1 field? e.g: db.collection.update(query, { $set: {extracted: {hello: {world}}}, $push: {extracted: {array_field: {val:1}}} }, upsert=True)
[16:35:51] <jamis> in this case, array_field is inside the extracted dict. I need to create or update the extracted dict and push values onto extracted.array_field.
[16:36:35] <joshua> Have you tried using dot notation?
[16:37:19] <jamis> joshua I tried doing {$push: {'extracted.array_field': {val:1} }} using pymongo and didn't have luck
[16:39:07] <jamis> @joshua error when trying dot notation: "pymongo.errors.OperationFailure: Cannot update 'extracted' and 'extracted.array_field' at the same time"
[16:39:08] <joshua> Hmm. I thought it worked on the shell anyway. Never tried with pymongo (Or if I have, I don't remember what I ended up doing)
[16:43:16] <joshua> That error sounds like it might work, just not both at once so maybe you have to do them one by one.
[16:45:39] <jamis> So do the upsert with the $set first and then do another update with dot notation on the nested extracted.array_field?
[16:49:14] <blahsphemer> I am able to connect to my mongodb using hostname as localhost, but if I specify my own IP address, I can't
[16:50:20] <joshua> jamis: I'm probably not the best person to answer, but that's what I was thinking. You can iterate through them and perform two actions
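A sketch of that two-step approach in shell syntax (field names from the conversation; `query` is a placeholder selector). Using dot notation in the $set avoids replacing the whole `extracted` subdocument, and splitting the $set and the $push into two updates avoids the path-conflict error jamis hit:

    var query = { _id: 123 };  // placeholder selector

    // step 1: upsert the parent fields without touching the array
    db.collection.update(query,
        { $set: { 'extracted.hello': 'world' } },
        { upsert: true })

    // step 2: push onto the nested array with dot notation
    db.collection.update(query,
        { $push: { 'extracted.array_field': { val: 1 } } })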
[16:51:09] <joshua> blahsphemer: There is a --host and --port flag. By default it uses localhost anyway
[16:52:56] <blahsphemer> joshua, I found bind_ip in /etc/mongod.conf,
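For anyone following along: a 2.4-era /etc/mongod.conf sketch; a bind_ip of 127.0.0.1 is exactly why only "localhost" works (the addresses below are placeholders, and mongod must be restarted after editing):

    # listen on every interface, or list specific addresses comma-separated
    bind_ip = 0.0.0.0
    # bind_ip = 127.0.0.1,192.168.1.10
    port = 27017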
[17:43:48] <cofeineSunshine> ERROR: error creating index when cloning spec: { name: "query_1_$snapshot_1", key: { query: 1.0, $snapshot: 1.0 }, ns: "bidgeon_prod_db.task_actions", background: true } error: CannotCreateIndex bad index key pattern { query: 1.0, $snapshot: 1.0 }: Index key contains an illegal field name: field name starts with '$'.
[17:44:00] <cofeineSunshine> there is no such index on primary replica set instance
[17:44:12] <cofeineSunshine> but it gets this error
[17:44:36] <cofeineSunshine> removed this index from primary
[17:44:54] <cofeineSunshine> but the replica hits this error and restarts from the beginning
[17:50:58] <Kaiju> cofeineSunshine: Followed this yet? http://docs.mongodb.org/manual/release-notes/2.6-upgrade/
[17:52:34] <cofeineSunshine> Kaiju: yes, I've been following it. I did the db.upgradeCheckAllDBs() check; it told me about those $snapshot indexes
[17:54:13] <prateekp> how can i have "has_many" folders as its option
[17:55:03] <cofeineSunshine> prateekp: list of ids, or DBRefs
[17:55:51] <Kaiju> cofeineSunshine: I messed up my migration pretty badly when I did it. Got stuck in a race condition that could not be resolved. I ended up doing a mongodump backup, wiping everything, starting up new 2.6's all the way around and reimporting the data.
[17:56:16] <Kaiju> 1 shard server, 3 shards and 2 replicas
[18:43:26] <joshua> cozby: if you stop both, then start one up that has your data, then start the second one with the data removed, it should sync back up
[18:43:59] <joshua> But you really should add a 3rd in, even if it's just an arbiter that doesn't hold data
[18:44:52] <joshua> And if all else fails you can start a node up as a single db without replication enabled and you can get your data OR do a dump from the data directory with the server stopped
[19:26:58] <Zelest> ts33kr, fix your shit please. :(
[19:30:49] <kali> cozby: you must reconfigure with the force option on the remaining secondary, and discard the broken replica
[19:31:11] <kali> cozby: and then, make sure you setup an arbiter. a replica set with two nodes is a disaster waiting to happen
[19:31:38] <kali> cozby: well, yours has actually happened
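A hedged sketch of kali's recovery, run in the shell on the surviving secondary (the member index and arbiter host are placeholders):

    cfg = rs.conf()
    // keep only the healthy member, discard the broken one
    cfg.members = [ cfg.members[0] ]
    rs.reconfig(cfg, { force: true })
    // then add an arbiter so two data nodes can still elect a primary
    rs.addArb("arbiter.example.net:27017")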
[19:32:08] <joshua> It's not a disaster, it's a learning experience :)
[20:09:36] <michaelchum> I would like to aggregate documents whose value for a key is not something OR not something else, such as [ { $match: { $or [{ "mykey": { $ne: "myvalue1" } }, { "mykey": { $ne: "myvalue2" } } ] } } ]
[20:10:22] <michaelchum> But it gives me some errors: "Command failed"
[20:11:05] <michaelchum> Any ideas how I can do NOT MATCH something or something with aggregation?
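Two problems there, most likely: `$or [` is missing its colon, and $or of two $ne clauses matches every document (any value differs from at least one of two distinct values). "Neither value" is $nin; a sketch using the names from the question:

    db.mycollection.aggregate([
        // matches documents whose mykey is neither value
        { $match: { mykey: { $nin: [ "myvalue1", "myvalue2" ] } } }
    ])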
[21:56:43] <russql> i guess what i'm asking for is not possible
[22:07:00] <djlee> Hi all, anyone here using mongohq ?
[22:08:09] <rafaelhbarros> no, everybody here uses mysql
[22:08:23] <rafaelhbarros> if you can't use mysql, go for sqlite3
[22:08:39] <rafaelhbarros> djlee: if you need anything, shoot, people are very active here
[22:08:59] <djlee> rafaelhbarros: im going to assume that was your attempt at sarcasm, but can i just point out i said mongohq not mongodb, im not that stupid :P
[22:09:24] <rafaelhbarros> djlee: yes, I'm the stupid one with lysdexia
[22:09:35] <rafaelhbarros> djlee: I have one free account there
[22:09:50] <rafaelhbarros> djlee: it works fine, I don't do any heavy lifting
[22:09:58] <rafaelhbarros> I mostly do prototyping
[22:10:00] <djlee> i have a specific issue with mongohq where its taking a LONG time to open a connection, and i mean its gone from near instant on a local mongo install, to up to 1 second on mongohq
[22:11:32] <djlee> rafaelhbarros: its PHP, abstracted by a few libraries (so theres a 3rd party lib that manages creating the connection, which uses the php mongo driver)
[22:14:19] <djlee> rafaelhbarros: no worries, can you tell me (i'm new to replicasets): every time i create a connection, should i be connecting to the replicaset uri? or should my application do some caching? Not sure if it's because i'm using a replicaset uri that it's got slow, or if it's network related
[22:14:50] <rafaelhbarros> my python code uses pymongo
[22:15:30] <rafaelhbarros> and it's MongoClient("mongo-server-01", replicaSet='foo')
[22:15:47] <rafaelhbarros> so, I basically put one of the replicas, it figures out which one is which
[22:17:41] <djlee> cheers rafaelhbarros, only difference i can see is i provide multiple members of the replicaset (as advised by mongohq), so maybe i'll just try the one member and see what happens, just to start ruling stuff out
[22:18:10] <rafaelhbarros> djlee: alright, let's see where it takes you
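A sketch of what that URI looks like with the Node.js native driver (hosts, credentials, and set name are placeholders shaped like what MongoHQ hands out; the replicaSet parameter has to match the server's actual set name exactly):

    var MongoClient = require('mongodb').MongoClient;

    var uri = 'mongodb://user:pass@host1.mongohq.com:10001,host2.mongohq.com:10002/mydb?replicaSet=set-xxxxxxxx';

    MongoClient.connect(uri, function (err, db) {
        if (err) throw err;
        // open once at startup and reuse this handle across requests;
        // reconnecting on every request is a common cause of ~1s connects
        console.log('connected to', db.databaseName);
    });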
[22:19:09] <russql> is there a way to unwind multiple fields?
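For russql's question: yes, one $unwind stage per field, chained; note that unwinding two arrays yields the cartesian product of their elements (collection and field names below are placeholders):

    db.items.aggregate([
        { $unwind: "$tags" },   // one document per tags element
        { $unwind: "$sizes" }   // then one per (tags, sizes) pair
    ])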
[22:28:30] <cofeineSunshine> now I have 2 secondary servers killing themselves trying to apply that operation. And on those servers the oplog collections are getting bigger
[22:29:08] <cofeineSunshine> As I understand it, SECONDARY instances read from the PRIMARY's local.oplog.rs collection and apply those operations to themselves, yes?
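That is broadly the model (secondaries can also chain from other secondaries). A quick way to see what they are replaying is to read the tail of the oplog; a shell sketch:

    // run on the primary: the newest few oplog entries are exactly what
    // the secondaries tail and apply
    db.getSiblingDB("local").oplog.rs.find()
      .sort({ $natural: -1 })
      .limit(5)
      .pretty()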
[22:37:30] <joannac> I'm confused. why can't your secondaries apply the operation?
[22:39:28] <djlee> hey rafaelhbarros, this is going to sound silly, but i got the replica set name out of mongohq by digging through a bunch of debug info in mongohq, which sounds like a lot of hassle for information you actually need to work with it. By any chance should i be using a more generic replica set name such as the host name or database name? At the moment i have it set to "set-<random string>" which i found in mongohq s
[22:46:01] <rafaelhbarros> djlee: one second let me read that
[22:46:38] <cofeineSunshine> joannac: huh, I have a situation here. I was migrating my replica set from 2.4 to 2.6.2. There were some indexes with field names starting with '$'
[22:47:47] <cofeineSunshine> it was in STARTUP2 mode several times, and restarted from scratch when it encountered the faulty index (cleaned the db and started from the beginning). At some point, on the primary instance, I deleted that shitty index.
[22:48:12] <rafaelhbarros> djlee: well, with mongohq I'm not sure how you name your replicaset, you should have that in a setting somewhere
[22:48:22] <cofeineSunshine> Now i have a situation where 2 SECONDARY instances are trying to apply the removal operation for that faulty index
[22:48:24] <rafaelhbarros> djlee: and yes, it's a hassle to have to dig through logs for something you need
[22:49:25] <djlee> rafaelhbarros: yeah i was just wondering whether i had the set name wrong or was doing something silly. I may just drop tech support a note, i'm almost certain its something im misunderstanding
[22:49:57] <rafaelhbarros> djlee: also, in regards to naming: samething as any variable ever written: MexicanFoodReplicaSet
[22:53:18] <cofeineSunshine> joannac: Now it runs forever*, trying to apply that operation and failing
[22:53:36] <cofeineSunshine> when I try to stepDown the primary, the other server crashes