pmxbot IRC Log Viewer

[00:23:29] <quuxman> I'm struggling with getting my queries to use indexes at the moment

[00:24:44] <quuxman> I'm also having issues with a recent upgrade from 2.4 to 2.6, and failing to build indexes at all. There's the failIndexKeyTooLong command flag, but they don't expose that (MongoHQ / Compose.io). Is there a way to set it from the Mongo client?

[00:26:55] <joannac> runCommand( { setParameter: 1, failIndexKeyTooLong: false } )

[00:27:05] <quuxman> joannac: thanks!

[00:27:30] <quuxman> now I just have to figure out how to make the mongo client connect. For some reason using the exact same connection string I'm using in pymongo fails

[00:29:51] <quuxman> can the mongo client connect to a replica set? My connection string has multiple servers

[00:30:22] <joannac> yes

[00:30:28] <joannac> wait

[00:30:32] <joannac> the "mongo client"

[00:30:35] <joannac> you mean the shell?

[00:31:19] <quuxman> I get "Fri Oct 31 00:30:31 Assertion failure _setName.size() client/../db/../client/dbclient.h 156"

[00:31:26] <quuxman> yeah

[00:31:32] <joannac> what are you typing?

[00:32:38] <quuxman> mongo mongodb://live:XXXXXXX@candidate.XX.mongolayer.com:XXXX,candidate.XX.mongolayer.com:XXXXX/XXXXXXXXX # obviously sensored

[00:33:39] <joannac> doesn't work like that

[00:33:51] <joannac> connect to a single host

[00:34:31] <quuxman> presumably the master? Do I have to execute the runCommand on both, I presume?

[00:34:47] <joannac> yes

[00:34:55] <joannac> you should be able to do it on all nodes

[00:35:13] <quuxman> well I can't connect to an individual node. If I remove one of the servers, it says "bad connection string"

[00:35:35] <joannac> actually, my bad

[00:35:36] <joannac> http://docs.mongodb.org/manual/reference/program/mongo/#cmdoption--host

[00:36:02] <quuxman> er, sorry "exception: HostAndPort: bad config string"

[00:39:47] <quuxman> AH ok, I think I've run into this before, that for some ridiculous reason the mongo shell client doesn't use standard connection strings at all

[00:40:03] <quuxman> I imagine I can use runCommand with pymongo

[00:40:58] <quuxman> should've done that to start with

[00:43:03] <quuxman> I managed to authenticate with the mongo shell, and now I see: "errmsg" : "setParameter may only be run against the admin database.", but I think I can figure that out ;-)

[00:43:32] <quuxman> ah shit, I'm not authorized

[00:55:58] <Terabyte> hey, I have a document that exists, and I want to add to a subportion of it (in this example, I want to add another sub element to the array of subValues belonging to itemid 4 belonging to myId 1. Still trying to get relational world out of my head, what's the "doc" way of doing this? https://gist.github.com/danielburrell/8249094d5008c9f62fe1

[00:57:58] <Terabyte> A dumb way is to select the document I want, turn it to a pojo, add the item to my pojo, and write the document back (overwriting the old/deleting the original). but I'm fairly sure update() exists for a reason...

[00:58:10] <joannac> $Push

[00:59:12] <Terabyte> thanks i'll take a look at that

[01:16:07] <quuxman> When you create an index, the keys are specified in a dict, so I assume there is no order enforced to the index?

[01:16:52] <quuxman> (like in mysql (at least when I used it), where different queries can use an index (foo, bar) than (bar, foo)?

[01:28:12] <lpghatguy> Is there a better way to force a case-insensitive unique index than storing a duplicate normalized case version with its own unique index?

[01:28:23] <lpghatguy> err, that's probably more of a Mongoose-specific question

[01:32:35] <GothAlice> quuxman: No, BSON dictionaries are ordered. Unfortunately not all languages support this feature, so some drivers, like pymongo, have custom datatypes to handle it.

[01:33:54] <GothAlice> quuxman: This means that an index on {company: 1, date: -1} will only optimize queries on (company) and (company, date). Luckily the query optimizer usually figures out to re-arrange incoming queries (if possible), but sometimes you need to give MongoDB a hint. Thus: http://docs.mongodb.org/manual/reference/operator/meta/hint/

[01:35:48] <GothAlice> (There are some nifty optimizations around having frequently accessed data accessible entirely within index data. In an $explain you can see if you're getting this optimization when indexOnly is true.)

[01:39:08] <hicker> Hi everyone, can anyone tell me why I get { __v: 0, _id: 5452dc48d687bad849d70816 } after executing create()? http://pastie.org/9686889

[01:40:31] <quuxman> GothAlice: ah, crazy

[01:40:36] <quuxman> I should know this already

[01:40:45] <GothAlice> :) No worries.

[01:43:12] <GothAlice> hicker: __v looks like a version number, and drivers may return the inserted document's ID upon insert. Unfortunately the documentation for mongoose is bad and should feel bad for not specifying the return value more explicitly than just "<Promise>".

[01:46:44] <hicker> Oh, hmm. I was expecting there to be a customerName node in the document after executing Customer.create({ customerName: 'John' }, ... Do I need to execute something after create()?

[01:47:02] <GothAlice> Likely a findOne.

[01:47:29] <GothAlice> Or you could store the document you are passing to insert in a variable, and return it updated with the returned values…

[01:47:55] <GothAlice> insert/create

[01:48:10] <hicker> But { customerName: 'John' } doesn't get stored in the document at all

[01:49:09] <GothAlice> So you were getting that { __v: 0, … } document as a result of a find after this web page hook?

[01:50:33] <hicker> That's what I'm getting when viewing the collection in Robomongo... I can try finding it using the console

[01:50:47] <GothAlice> Please do.

[01:51:06] <GothAlice> The interactive shell is where you diagnose problems.

[01:52:10] <GothAlice> Unfortunately, if that is in fact the case (the data isn't getting written), I'm at a loss. I don't Mongoose.

[01:54:39] <hicker> Ohk, well thank you for your help! I'll keep searching

[02:43:32] <Terabyte> hey

[02:44:22] <rkgarcia> hey Terabyte

[02:44:23] <Terabyte> I wanted to generate a unique id for a sub component of my document (what used to be autoincrement in relational world), it doesn't have to be a number, but it does have to be a scalar value. Right now I'm looking at the output, and I'm seeing a details json object with timestamps etc. is there anyway to get a single value?

[02:44:51] <Terabyte> I'm using new ObjectId() to generate the ID at the moment (org.bson)

[02:45:20] <rkgarcia> Terabyte, that's right, use the ObjectId for generate new ID's

[02:45:51] <Terabyte> right, but i'd like my ids to be a number, id:1, id:2..3..4. not id{metadata:12387123, timestamp:129724, version:923928} etc

[02:46:27] <GothAlice> Terabyte: How do you plan on keeping the numbers sequential? Are duplicates allowed?

[02:46:29] <rkgarcia> Terabyte, then you need to use a counter collection

[02:46:57] <Terabyte> GothAlice duplicates are not allowed,

[02:47:00] <GothAlice> Terabyte: Counters like that add an extra layer of locking (to be safe), and are such a problem Twitter created an entire service, separately scalable from the database, just to handle it.

[02:47:21] <GothAlice> (And are a big reason why MongoDB uses a better structure.)

[02:47:54] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L100-L110 < Even I generate ObjectIds for subdocuments; replies to a thread on a forum, in this case. The IDs are used frequently to look up specific comments, edit them, etc.

[02:48:20] <Terabyte> hmmmm

[02:48:33] <Terabyte> so you're saying that you submit the ID as a json object when you want to edit something?

[02:49:14] <Terabyte> ok i'm gonna have a thing

[02:49:15] <Terabyte> thanks

[02:49:31] <Terabyte> if people use that meta data when making changes then that's ok

[02:49:51] <GothAlice> https://github.com/bravecollective/forums/blob/develop/brave/forums/component/thread/model.py#L135-L161 < This does some argument list mangling, but yes, it boils down to finding the thread containing the reply (comment) with the appropriate ID, then updating a la "comments.$.foo"

[02:51:31] <Terabyte> ok

[02:51:57] <GothAlice> Terabyte: http://cl.ly/image/1U3z3h1q0r2T < When I emit HTML, I include important details in a way that is easily accessible from JavaScript. :)

[02:52:23] <GothAlice> (Thus most of my JavaScript is completely generic throughout my front-end.)

[02:55:48] <raar> Hey guys, I'm getting a bunch of errors in my web app like "71: Failed to connect to: mongo-example:27017: Remote server has closed the connection in /var/www/example.php on line X".

[02:55:58] <raar> this only happens sometimes, most connections are fine

[02:56:03] <raar> How do I troubleshoot this?

[02:56:26] <raar> I'm not seeing many slow queries, the disk i/o should be sufficient, I don't really know what the cause could be. I'm using mongodb 2.6.5

[02:56:36] <raar> my ulimit (hard and soft limits) is 64000

[02:56:50] <raar> I get between 0 and 500 update queries per second.. (and between 0 and 20 inserts)

[02:57:15] <raar> I normally don't get more than 35 simultaneous connections (have 51168 available)

[02:57:35] <raar> I don't really know how to troubleshoot it.. any ideas/suggestions?

[02:58:23] <raar> (also, it seems worse after restarting mongod.. more application errors, but I only _think_ this is the case [don't want to restart mongo again to see if I get more problems in production])

[02:58:43] <raar> I have 500 provisioned iops (aws volume) on an ssd (io1)

[03:58:46] <Terabyte> GothAlice see I'm having trouble now deserializing...

[03:59:05] <Terabyte> web browser sends the ID of the document, but there's too many fields for the constructor of ObjectId

[03:59:36] <Terabyte> so the json parser is having issues, at first i thought it was a naming convention issue, but then i look at the constructor, and there's only 3 values, vs the id which contains 7

[04:27:17] <Ceseuron> Hi gang

[04:30:35] <joannac> hi

[04:30:59] <Ceseuron> Is anyone in here versed with MongoDB and handling duplicate key index errors?

[04:38:44] <joannac> yes

[04:39:06] <joannac> Ceseuron: I suggest you ask your actual question instead of asking meta-questions

[04:39:30] <Ceseuron> I was trying to avoid being rude and just filling up the channel. Sorry.

[04:39:48] <Ceseuron> Basically it goes like this...we're draining replicasets to migrate to new physical MongoDB servers

[04:40:09] <Ceseuron> The process is getting hung up on NUMEROUS duplicate key index errors

[04:40:40] <Ceseuron> And I basically have to sit here tailing the MongoDB log. When the duplicate error comes up, I have to do a manual removal of the duplicate document

[04:41:03] <Ceseuron> I'd like to know if there's a way I can scan the collection prior to shard drain for duplicate key indexes.

[04:41:16] <joannac> um, what?

[04:41:34] <joannac> you are draining a replica set with indexes

[04:41:44] <joannac> you insert all of the data into a new server with the same indexes

[04:41:52] <joannac> and you get duplicate key exceptions?

[04:42:00] <joannac> where are your duplicates coming from?

[04:42:26] <Ceseuron> errMsg: E11000 duplicate key error index

[04:43:16] <Ceseuron> The duplicates are the result of a long since past incident that never got corrected.

[04:43:37] <joannac> why don't you change your code to do upserts instead?

[04:44:29] <Ceseuron> I am not a developer. This is not my design. I'm the engineer responsible for draining these replicasets so the VMs that were running them can be removed and the entire collection consolidated down to physical systems.

[04:45:40] <Ceseuron> It all started out innocently enough, from what I understand. MongoDB on a few VMs to test feasability before production use.

[04:45:54] <Ceseuron> That turned into eighty virtual machines.

[04:46:12] <Ceseuron> forty shards in VMware.

[04:46:43] <Ceseuron> And a situation at one point that resulted in more than one "primary" in a shard writing data to the collection.

[04:46:43] <joannac> are you using mongodump/mongorestore?

[04:46:48] <Ceseuron> no

[04:46:49] <joannac> hahaha

[04:47:00] <joannac> mate, you're screwed

[04:47:19] <Ceseuron> Normally I'd tend to agree with you

[04:47:37] <Ceseuron> Except I've drained 37 of 40 shards now.

[04:48:24] <Ceseuron> I'd just like to know if there's a way I can handle the duplicates automatically.

[04:48:36] <joannac> depends what you're doing

[04:48:43] <joannac> I would just modify the code to upsert

[04:48:52] <Ceseuron> Yeah our developers aren't that "responsive"

[04:49:08] <Ceseuron> Here's what I'm doing now.

[04:49:38] <Ceseuron> I run out a couple of SSH sessions onto the replicaset's primary member. On one, I tail the mongodb log in realtime (-f) and grep for E11000

[04:50:13] <Ceseuron> When one comes up, I use the mongod shell on the other to do a db.members.app.remove({blablablabla})

[04:50:22] <Ceseuron> And the drain continues until the next error comes up

[04:50:33] <joannac> the drain continues?

[04:50:36] <Ceseuron> If the duplicate is not handled...well you can pretty much guess what happens.

[04:50:55] <Ceseuron> This process has worked for 37 out of the 40 shards in VMware that I've been draining.

[04:51:01] <joannac> you and I are having a terminology problem

[04:51:06] <Ceseuron> Perhaps.

[04:51:14] <joannac> "Drain" to me means "get a document out of the database"

[04:51:21] <joannac> what does "drain" mean to you

[04:51:35] <Ceseuron> move the chunks off the replicaset.

[04:51:44] <joannac> that doesn't mean anything. How?

[04:52:03] <Ceseuron> We issue rs.removeShard from the MongoS shell.

[04:52:11] <joannac> sh.removeShard()

[04:52:39] <Ceseuron> db.runCommand({removeShard:"shardname"})

[04:53:22] <joannac> how do you decide which one should win?

[04:53:41] <joannac> is it arbitrary?

[04:54:49] <Ceseuron> Hmm...So when we had 40 shards going on 80 VMs, before I was actually working here, they added 4 shards running on some beefy servers with a hefty amount of storage and RAM.

[04:55:00] <Ceseuron> Then when I started, they handed the project to me.

[04:55:21] <Ceseuron> I really hate to say it, sir, but you probably know more than I do about Mongo at this point.

[04:56:24] <Ceseuron> I know enough to know that the duplicates have to be removed, otherwise Mongo simply fills up the disk space on the shard being moved endlessly retrying the move on a duplicate document.

[04:57:05] <Ceseuron> I believe the consensus from the data team and the devs was "Mongo will simply move data to remaining shards until only the physical shards are left as destinations".

[04:57:32] <Ceseuron> If there was a way to define "who wins", you'll be the first person I've talked to that knows it.

[04:58:14] <joannac> no, I mean

[04:58:20] <joannac> you have shards A B and C

[04:58:25] <joannac> you remove shard C

[04:58:35] <joannac> shard C's data drains to A and B

[04:58:44] <joannac> you get a duplicate key error

[04:58:59] <joannac> you remove the existing document on A or B so the drain from C can continue

[04:59:16] <Ceseuron> No, I remove it from C

[04:59:34] <joannac> Would you be just as happy removing it from A or B?

[05:00:01] <Ceseuron> I would be just as happy to make it so that the shard drain doesn't have me sitting here manually removing individual documents that could possibly number into the thouands.

[05:00:10] <joannac> drop all your unique indexes

[05:00:19] <joannac> ensure them again at the end, and specify dropDups

[05:00:38] <joannac> it will keep one document and drop all the rest

[05:00:49] <joannac> which is okay because you don't care which dupe gets kept

[05:00:59] <Ceseuron> the documents are identical, so yeah

[05:01:11] <Ceseuron> db.ensureIndex, I assume?

[05:01:37] <joannac> wait, hold on

[05:02:08] <Ceseuron> I would prefer, if possible, to get a list of documents that have duplicate keys in the collection if possible.

[05:02:17] <Ceseuron> And simply remove those en masse.

[05:03:00] <Ceseuron> this is a production system. It is literally in production now with active users on the application that depends on it. I will probably get shot or at least fired if I cause a service interrupt.

[05:03:30] <joannac> how would you propose to do that? How would you even tell if there was more than one document with the same set of keys?

[05:04:39] <Ceseuron> the "ownerid" should be unique for every document. Aggregate all documents in the collection and list out any that hit more than once for the same key.

[05:05:16] <Ceseuron> In short, if the ownerid field is the same for more than one document, that's a duplicate and should be removed.

[05:05:26] <joannac> erm, okay

[05:05:39] <joannac> I wouldn't be running an aggregation on 40 shards' worth of data in production

[05:06:04] <Ceseuron> We're down to 7

[05:06:12] <joannac> but you still need to drop the unique index on ownerid to get duplicates in the first place

[05:06:13] <Ceseuron> Guess how many weeks I've been doing this?

[05:06:45] <Ceseuron> Let's just say waiting in line at the DMV is preferable to this...migration

[05:07:56] <Ceseuron> Hrm...

[05:08:02] <Ceseuron> http://stackoverflow.com/questions/15323360/mongodb-unique-index-on-date-field/15323839#15323839

[05:08:24] <Ceseuron> The second answer...aggregate query..would that do something in this situation?

[05:09:57] <joannac> the aggregate will show you dupes... i think

[05:10:11] <joannac> assuming you're willing to take the potential performance hit

[05:11:47] <Ceseuron> The collection that this would run on is 168GB in size, according to the "show dbs" command on Mongod

[05:12:05] <joannac> I'm not even sure it will work... if you were inserting against shards directly

[05:12:46] <Ceseuron> My apolgies, by the way, if this is kinda irritating for you.

[05:12:59] <Ceseuron> It's no picnic for me either.

[05:12:59] <joannac> no, I'm just not 100% sure what the situation is

[05:13:07] <joannac> and you're asking me for advice

[05:13:30] <joannac> so I'm not sure what to tell you, that's all

[05:13:36] <Ceseuron> Understood.

[05:14:06] <Ceseuron> I don't believe we talk to the shards directly.

[05:14:20] <Ceseuron> There are mongos routers that handle all that.

[05:15:46] <joannac> then I don't know how you got dupes

[05:16:46] <Ceseuron> I'm not entirely sure either. All I know is that there was an incident in days long since past where they ran into a problem with two primaries in the same replicaset.

[05:16:58] <joannac> Hrm

[05:16:59] <Ceseuron> The problem was never dealt with then in terms of the data.

[05:17:01] <joannac> What version?

[05:17:27] <Ceseuron> 2.2.6

[05:17:32] <joannac> dude

[05:17:36] <joannac> I hope you're upgrading

[05:17:46] <Ceseuron> That's happening once this is done

[05:17:55] <joannac> Okay, so, even if there were 2 nodes in the same replica set, you shouldn't get dupes

[05:18:44] <Ceseuron> I'm not sure what the configuration was before. Now it's a primary, secondary, and an arbiter for what I assume is a quorum role.

[05:18:49] <joannac> yes

[05:19:04] <Ceseuron> It was never supposed to grow in VM land

[05:19:46] <Ceseuron> When I got the project, there were 40 shards on 80 VMs. Before that, I believe it was more like 70 or 80 shards.

[05:19:59] <Ceseuron> And an entire ESXi cluster. driving it.

[05:20:44] <Ceseuron> Had this been my project from the get go, it would have never made it into VMs for the primary and secondary. I would have demanded physical assets be used.

[05:26:30] <joannac> I think I have a similar script to do this kind of thing

[05:27:36] <joannac> I don't have the resources to modify it for your environment though

[05:27:51] <joannac> Maybe post on GG or SO and see if someone has the resourcing for it?

[05:28:43] <joannac> This is the kind of thing you would get with a support contract with MongoDB :)

[05:28:50] <Ceseuron> We purchased one

[05:28:51] <Ceseuron> :D

[05:28:54] <joannac> oh

[05:29:00] <joannac> then why aren't you opening a ticket?

[05:29:48] <Ceseuron> It's 10:30PM my time

[05:29:54] <joannac> And?

[05:31:07] <Ceseuron> Didn't think they'd be open at this hour.

[05:31:16] <joannac> Dude

[05:31:21] <joannac> We run 24/7/365

[05:31:50] <Ceseuron> Mongo is serious busienss, eh?

[05:31:56] <Ceseuron> business...

[05:32:07] <joannac> Yes, people run it in production.

[05:34:01] <joannac> Open a ticket. You have a production issue. You're paying MongoDB for a reason...

[05:38:41] <Ceseuron> Of course, now I have to dig up where the hell we store tha tinfo

[05:38:43] <Ceseuron> that info

[05:49:47] <mgeorge> can you create a user for a database that has read/write access only to a specific collection?

[05:53:11] <joannac> in 2.6, yes

[05:53:13] <joannac> i think

[05:54:57] <mgeorge> i think i'll just use a seperate db

[05:55:33] <mgeorge> building out a session.class.inc in php so it stores all session data in mongodb

[05:55:41] <mgeorge> web servers will be load balanced behind an F5

[06:06:31] <quuxman> If I have a query like so: {foo:'toggle1', bar:'toggle2', baz:'thing'}, sort=[('updated',-1)], and I have an index of {baz:1, updated:-1}, why can't this query use my index?

[06:06:53] <quuxman> I don't want to add foo and bar into my index, because they only have a handful of values

[06:07:25] <quuxman> Should I create an {{baz:1, foo:1, bar:1, updated:-1} index?

[06:08:04] <joannac> you want an index on {updated: -1, baz:1}

[06:08:59] <quuxman> but if I take out the other clauses, it uses my existing index

[06:09:00] <joannac> wait

[06:09:36] <joannac> pastebin the result on explain(true)?

[06:11:37] <quuxman> joannac: http://pastebin.com/PNV3G4C9

[06:11:42] <quuxman> joannac: thanks for looking at it :)

[06:12:05] <quuxman> ah wtf?

[06:12:09] <quuxman> now it's working

[06:12:14] <quuxman> I didn't even look at the output this time

[06:12:27] <joannac> :p

[06:12:34] <quuxman> now it's 6ms instead of 1635

[06:12:55] <quuxman> the only thing I changed was adding the index you suggested, but it's not even using it

[06:13:31] <quuxman> oh that's why

[06:13:33] <quuxman> I pasted the wrong explain

[06:14:03] <joannac> an exaplin from the shell would be better

[06:14:05] <joannac> easier to read

[06:15:41] <quuxman> joannac: http://pastebin.com/6STZF4uU

[06:16:11] <quuxman> kk, I'll connect with mongo client. It's easier for me to use my DB code because I'm used to it

[06:16:34] <joannac> quuxman: there's no sort there

[06:17:11] <joannac> also I want explain(true), not explain()

[06:18:26] <quuxman> joannac: the sort is added by that function

[06:19:37] <quuxman> oh godamn, it's too late. stupid mistake

[06:20:33] <quuxman> allright, onto the next entry in the slow query log :-/

[06:24:25] <quuxman> OK, enough with the specific examples. I need to understand the theory of how Mongo uses indexes with $and and $or

[06:27:36] <quuxman> say I have a clause like: {$or: [{foo: 'blah'}, {bar: 'blahblah'}, {small_set:2, baz:{$in: [ID1, ID2, ..., ID10]} ]}

[06:28:29] <quuxman> and I have 3 indexes that start with foo, bar, and baz. Should this be able to use these 3 indexes? (my test says no)

[06:29:30] <quuxman> I would expect the planner to do three separate queries for each clause in the $or, stick them together, and then sort (on updated, which wouldn't use an index)

[06:31:47] <quuxman> hm, that's actually what it's doing, based on what I can understand from the explain, it's just scanning far more rows than I'd expect

[06:31:57] <quuxman> I should probably create a foo_bar_baz index?

[06:34:31] <quuxman> wow, using .explain(true) with the shell gives me way more information than with pymongo.

[06:36:24] <quuxman> would creating a foo_bar_baz index not help at all, because it's an $or clause?

[06:36:37] <quuxman> I'm fairly certain that's what I'd want to do if it was $and

[06:43:36] <quuxman> I'm reading about index intersection which is straightforward to understand, but I don't know whether it applies to $or

[06:45:53] <quuxman> my intuition says no

[06:57:22] <chovy> it seems all my users are ending up in admin databaase, even when I do `use myDb` before creating a suer

[06:57:25] <chovy> user

[07:00:45] <joannac> using 2.6?

[07:03:46] <chovy> joannac: yeah

[07:04:20] <joannac> that's expected

[07:05:39] <chovy> ok

[07:06:29] <chovy> what is the Users collection on a db then?

[07:07:02] <chovy> i have Functions, Collections, and Users

[07:08:26] <joannac> that's your collection?

[07:08:46] <chovy> no

[07:08:51] <chovy> its showing in robomongo

[07:08:55] <chovy> i'm not too sure what those are

[07:09:01] <chovy> i know my data is under Collections

[07:09:16] <joannac> does it show up in the mongo shell?

[07:09:29] <chovy> how would i query it?

[07:09:38] <chovy> i'm not sure what robomongo is doing to show that list

[07:09:46] <joannac> opena mongo shell, then "use databasename; show collections"

[07:11:18] <chovy> my admin users can't query my db

[07:11:25] <chovy> weird

[07:11:30] <chovy> i did use admin

[07:11:35] <chovy> then db.createUser()

[07:11:54] <chovy> is there a way to create a system wide admin?

[07:11:59] <chovy> do i have to name every db?

[07:12:17] <chovy> https://gist.github.com/chovy/f35b56b61962c53eccec

[07:12:20] <chovy> this doesn't appear to work

[07:13:01] <joannac> that's a user admin

[07:13:07] <joannac> it lets you administrate users

[07:13:18] <joannac> i.e. add users, remove users, modify user's privileges

[07:13:22] <chovy> ok

[07:13:26] <chovy> he can't query dbs?

[07:13:31] <joannac> you want dbAdminAnyDatabase

[07:13:45] <chovy> just add another json object below that one?

[07:14:40] <chovy> joannac: https://gist.github.com/chovy/f35b56b61962c53eccec

[07:14:44] <joannac> chovy: http://docs.mongodb.org/manual/reference/method/db.grantRolesToUser/

[07:14:49] <chovy> will that get me a super user who can add users and query dbs?

[07:15:01] <joannac> yes

[07:15:49] <joannac> but wait

[07:15:53] <joannac> no

[07:16:02] <joannac> you also want readWriteAnyDatabase

[07:16:21] <joannac> it also doesn't include cluster roles

[07:16:26] <joannac> like replica set or sharding commmands

[07:17:12] <chovy> ok

[07:17:16] <chovy> i don't have any of that yet

[09:28:03] <chrishill> meteor

[09:56:15] <IPhoton> hello, I would like to know if it's better for learning to install mongodb locally or use something like compose.io

[09:56:44] <IPhoton> I heard that locally it does take a lot of space, like 300mb, but is that for the installation or per project?

[10:15:50] <safsoft> Hello

[10:16:07] <safsoft> Guys a very big problem

[10:16:19] <safsoft> we have bg data

[10:16:23] <safsoft> millions of documents

[10:16:45] <safsoft> when trying to launch a query

[10:16:55] <safsoft> the cursor freeze at 4Mo of returned data

[10:17:12] <safsoft> then it takes too long to continue the second part of the query

[10:17:46] <safsoft> anybody have an idea on how to force mongo to get data without loading too much ?

[10:17:52] <safsoft> maybe with cache?

[10:18:12] <safsoft> or cursor.batchSize

[10:18:13] <safsoft> ../

[10:18:14] <safsoft> ?

[10:18:21] <safsoft> any help?

[10:19:09] <safsoft> Any mongo Expert around here ?

[10:19:27] <safsoft> How to deal with Big queries?

[10:21:27] <safsoft> did someone faced the issue ?

[10:21:36] <safsoft> how to deal with Big queries ?

[10:21:57] <safsoft> is there a way to tell mongo to continue returining data without freeze ?

[10:25:38] <safsoft> what is this problem of 4Mo limit ??

[10:25:52] <safsoft> can someone explain ?

[11:49:35] <safsoft> Hello

[11:49:45] <safsoft> Is there traffic on this channel?

[11:49:55] <safsoft> Where we can ask about mongoDB ?

[11:50:00] <safsoft> cause no one is answering

[12:02:22] <yopp> safsoft, you are on wrong side of the planet then :)

[12:18:00] <joannac> safsoft: ask your question

[12:18:13] <joannac> if you had asked maybe I could've answered by now :)

[12:46:10] <cheeser> what is a "4Mo limit?"

[12:46:32] <joannac> cheeser: Mb?

[12:47:53] <cheeser> that would make more sense. sort of.

[13:34:32] <Salyangoz> hello

[13:46:53] <safsoft> HEllo

[13:50:43] <safsoft> Can I ask my question ?

[13:51:09] <cheeser> of course

[13:51:17] <safsoft> Heyy

[13:51:23] <safsoft> We have a problem

[13:51:36] <safsoft> we have a query that returns nearly 7000 documents

[13:51:51] <safsoft> if we launch it

[13:51:52] <cheeser> just please don't press enter until you're done with your thought.

[13:51:57] <safsoft> ok

[13:54:28] <safsoft> the collection contains about 2 millions on documents, and indexes are placed correctly, so we have executed a query with .limit(10000), this query will return ~4800 documents and freeze searching, and after long time it returns the last part of the result

[13:55:42] <safsoft> If we try the .count() on this query is freeze completely, is there a way to accelerate at least the count ?

[13:56:22] <safsoft> Please notice, that we execute the query directly on the console in a .js file

[13:58:11] <safsoft> do you have questions ?

[13:58:20] <Salyangoz> I do but its semantic.

[13:58:35] <Salyangoz> are there any db/collection naming conventions in mongodb

[13:58:50] <Salyangoz> e.g. I use pep-8 for naming conventions in python

[14:01:40] <hicker_> Mongoose/Express question here: Why is a blank document created after executing Customer.create({ customerName: 'John' })? http://pastie.org/9686889

[14:07:43] <Terabyte> hey

[14:08:33] <Terabyte> I have a document I'm persisting to mongodb, the document contains subsections which I might want to query for, and update. Do these subsections count as documents in their own right?

[14:10:31] <tscanausa> Terabyte: no

[14:10:36] <Terabyte> When modelling I had 2 classes, parent object modelling the person, and a list of child objects. I gave the child objects an _id value, and told mongo via jongo @Id, @ObjectId annotation that these needed their ID to be generated, but for some reason mongodb only generates the parent Id...

[14:10:40] <Terabyte> am I "doing it wrong"?

[14:11:22] <cheeser> "subdocuments" don't get/need their own _id

[14:11:40] <cheeser> so, yes, you're doing it wrong.

[14:11:45] <safsoft> have you a method to accelerate teh count() ?

[14:11:46] <tscanausa> Well they can, but they dont automatically.

[14:11:48] <cheeser> wrongly :)

[14:11:53] <Terabyte> i see

[14:12:00] <Terabyte> so, if a user is editing a subdocument and wants to tell the server that field a on that subdocument what's the approach taken?

[14:12:12] <Terabyte> needs to be updated*

[14:12:20] <tscanausa> update the whole document

[14:12:26] <safsoft> seems that my problem on Performance does not interest nobody :(

[14:12:28] <cheeser> you query by the enclosing document's _id and update

[14:12:38] <cheeser> safsoft: or no one has an answer

[14:12:45] <safsoft> :)

[14:13:38] <tscanausa> safsoft: there is no way to make it faster. unless you are not using the right indexes.

[14:15:24] <safsoft> indexes are rights, we have take the time to optimze the indexes

[14:15:50] <safsoft> is there some feature in mongo specific to count

[14:15:57] <safsoft> such as B-tree indexes ?

[14:17:06] <tscanausa> safsoft: if you are using the correct indexes then there is nothing more one can do, other then hardware performance.

[14:17:30] <Terabyte> sorry, I still don't understand how one can identify a subdocument without an id, even if the parentid is known that doesn't allow you to tell the server "i updated 'that' one"

[14:19:24] <cheeser> safsoft: you've run an explain to make sure you're using the index you think you are?

[14:19:54] <tscanausa> Terabyte: I like to think of documents in mongo as a json hash that has an pointer to it ( aka the id ) and such how would I normally update a sub element in any json hash?

[14:20:01] <cheeser> Terabyte: well, subdocuments would need *some* kind of unique identifier. a mongodb/driver managed _id just isn't it in this case.

[14:20:11] <cheeser> what tscanausa said

[14:20:49] <Terabyte> cheeser so I should generate my ownid, based on my own hash? mmm was really hoping mongodb would manage that for me :(

[14:21:31] <tscanausa> Terabyte: there are other options such as changing your data model, but you need to understand the risk and reward.

[14:21:33] <Terabyte> alternatively i could flatten the structure

[14:21:38] <Terabyte> yeah

[14:23:41] <cheeser> Terabyte: unless there's some preexisting unique property

[14:25:59] <Terabyte> yeah the only way that would be possible is if on serialisation/deserialisation the elements knew their own position in the array.

[14:26:30] <Terabyte> even that's a pain because user sorts locally and everything's broken

[14:27:06] <Terabyte> see, originally I was calling new ObjectId() for every subdocument, but then there were issues serializing because ObjectId() doesn't play nicely with jackson.

[14:27:15] <cheeser> positionally dependent updates are usually a code smell

[14:28:01] <Terabyte> agreed, my relational model had an autoincrement (sub documents had their own table), so this was avoided in this way.

[14:28:31] <tscanausa> sub documents could actually be a different collection

[14:28:59] <Terabyte> yeah, i thought that by doing that I was missing out on the doc way of doing things

[14:29:45] <tscanausa> its all about the trade offs.

[14:32:12] <smik> How to query 'man.property[0].size' > 300 ?

[14:32:27] <smik> "property" is the array.

[14:32:42] <smik> I want to know if the fisrt property of the man has size greater than 300

[14:33:41] <tscanausa> I am not sure you can smik.

[14:34:28] <smik> tscanausa: Damn

[14:34:32] <tscanausa> also doing thing off position in mongo is not great. mongo can / used to reengage the position of arrays.

[14:35:10] <smik> I don't understand tscanausa . Why isn't doing things off position in mongo a good thing? What's the alternative

[14:35:11] <smik> ?

[14:35:25] <smik> I mean if I have like a product which is in three sizes

[14:35:34] <tscanausa> mongo can / used to reengage the position of arrays.

[14:35:55] <smik> tscanausa: what is "/"

[14:36:23] <tscanausa> or

[14:36:36] <tscanausa> can or used to rearrange arrays

[14:41:04] <Terabyte> cheeser so I've decided that on create I'm going to generate a UUID (using randomUUID), store that as a string, shouldn't be a problem for serialization that way. any thoughts? (performance or uniqueness i guess would be the only things)

[14:41:15] <Terabyte> (generate a UUID in java that is)

[14:44:55] <cheeser> ewww. creating an ObjectId is probably faster. ;)

[15:14:38] <Terabyte> hey, say I have {docid:1 array:[{arrayid:1, subArray:[{},{}]}], how do I push onto the subarray belonging to arrayid1 of docid1?

[15:16:22] <Terabyte> found the $push feature but not sure how to describe an 'xpath-esque' {$push:array(arrayId=1).subarray{#}}

[15:17:16] <tscanausa> Terabyte: { $push : {'docid1.array':'newthing'}}

[15:18:03] <Terabyte> doesn't that append to the array rather than the subarray?

[15:23:41] <Terabyte> got it :)

[15:24:01] <Terabyte> {$push:{array.$.subarray:#}

[15:24:10] <Terabyte> "positional" operator apparently

[16:22:17] <Terabyte> gah! no x.$.y.$?

[17:07:18] <daidoji> hello, anyone got some sample code to look at to understand how to deal with BulkWriteErrors I can take a gander at?

[17:35:28] <daidoji> nevermind I got it

[19:08:54] <rekataletateta> db.myColl.find() how do I group this by "user"

[19:09:31] <rekataletateta> db.myColl.find().group("user")

[19:09:35] <rekataletateta> like this?

[19:14:08] <nicolas_leonidas> is there a way to write a query that would return true false values based on whether or not a key exists in each document?

[19:31:21] <skot> nicolas_leonidas: see the aggregation framework: http://docs.mongodb.org/manual/core/aggregation-introduction/

[19:31:39] <skot> I think you want $cond/$isnull

[19:32:27] <skot> rekataletateta: db.myColl.group -- http://docs.mongodb.org/manual/reference/method/db.collection.group/

[19:32:50] <rekataletateta> I know, but I need more help

[20:15:53] <defk0n> does anyone here have experience with pymongo? Im trying to create a ObjectId() but when i try to insert it into db it says OverflowError: BSON can only handle up to 8-byte ints

[20:16:13] <defk0n> what the freak

[20:24:41] <defk0n> ok so apparently i cannot just call ObjectId() to get myself a ID, i need to do ObjectId(json.loads(dumps(ObjectId()))['$oid'])

[20:24:43] <defk0n> freaking logic

[20:28:29] <cheeser> say what?

[20:50:34] <nicolas_leonidas> hi I'm using $cond and I wanna know if an array is empty or not with $if here's what I have http://paste.debian.net/129641/

[20:56:26] <nicolas_leonidas> nm I had an extra bracket

[20:56:39] <nicolas_leonidas> typo cause of 99% of my problems I have fat fingers

[21:48:00] <flyingkiwi> hey guys! today we added a second shard to our cluster. everything seems good. buts thats the point - i really dont know

[21:48:53] <flyingkiwi> NICs bandwith seems to be low. like 3mbps. is there a way to get an overview about what has to be move and how the progress is?

[21:50:35] <flyingkiwi> the documentation states "it will take some time", buts thats fairly relative to me :)

[21:53:11] <joannac> flyingkiwi: check sh.status() and see if chunks are moving

[22:01:10] <IPhoton> hello I posted a question yesterday but it didn't get answered. I would like to know if it's better to use mongodb locally for development/practice, or to use a remote service like mongohq. From what I heard, it takes a lot of space to install mongodb, which is about 250mb, now this is 250mb per project or just for the installation?

[22:01:26] <IPhoton> If it is just a one time install, I think it's not such a big deal, right?

[22:03:18] <flyingkiwi> joannac, s02.c01 17

[22:03:18] <flyingkiwi> joannac, s01.c01 9327

[22:03:18] <flyingkiwi> joannac, ew, seems pretty slow.

[22:04:03] <joannac> how busy is the server?

[22:04:32] <flyingkiwi> joannac, in terms of ops/s or...?

[22:04:47] <flyingkiwi> joannac, the servers are not overloaded in any case

[22:06:02] <flyingkiwi> don't get me wrong. I do not expect you saying "ou, this will exactly take 499 hours and 32 seconds". I just want to figure out a method to get an overvierw about the balancing status

[22:06:25] <joannac> see how long it takes to move a chunk, and extrapolate

[22:06:45] <joannac> IPhoton: 250mb?

[22:07:17] <flyingkiwi> and that sh.status() thingy looks quite okay. Can I get more details about whats going on with some special property like {beMegaVerboseAndTellMeHowLowItWillTake: true}?

[22:07:30] <IPhoton> well the installation on my drive says about that, joannac

[22:08:02] <joannac> well yeah, but that doesn't actually include database files and stuff, IPhoton

[22:08:03] <IPhoton> I have installed it on Windows and Linux, and the data/db folders I created are that size

[22:08:27] <joannac> flyingkiwi: nope, check the logs

[22:10:24] <IPhoton> so for learning you feel it's best to install locally, joannac?

[22:10:39] <IPhoton> I am trying to follow some of the MEAN stack tutorials

[22:10:49] <joannac> yes, I think so

[22:10:57] <joannac> I don't know how much mongohq expose to you

[22:11:32] <IPhoton> well I seen a tutorial and they use mongohq, but that's after the yo generator with all the configs

[22:12:05] <IPhoton> Okay, will just use the local install for now, then I will change if needed.

[22:12:09] <IPhoton> thanks, joannac

[22:12:33] <joannac> flyingkiwi: connect to a mongos in the mongo shell, use config; db.changelog.find({}, {what:1, time:1})

[22:13:59] <storyleforge> Hey there. I am trying to lay out the architecture for a scalable realtime analytics platform. Mongo seems like a great choice in terms of functionality. I'm looking at using heroku for this, which offers mongodb addons that can be kind of expensive in terms of storage space. I realize that this is a scaled solution and that's what I'm paying for vs a lot of space on a single instance. i'm thinking about moving older data into

[22:14:12] <storyleforge> be missing out on speed from a sql database, or just ease of use

[22:15:22] <joannac> you question cut out at "..moving older data into "

[23:37:25] <darkblue_b> story

[23:37:55] <darkblue_b> real-time ... can mean different things to different audiences

Log file Viewer

Help | Karma | Search:

#mongodb logs for Friday the 31st of October, 2014