#mongodb logs for Monday the 2nd of December, 2013

[00:10:42] <topwobble> joannac: thx. using multi + upsert and taking our server down for approximately 7 minutes (we shall see ;)
[00:22:10] <joannac> topwobble: multi + upsert? did you test it?
[00:24:15] <topwobble> joannac: yes tested on our staging server w/ 150k documents. Will try on prod w/ 1.5 million docs soon
[00:24:51] <joannac> okay. upsert + multi was not what I had in mind. but if it works for you, good
[00:25:21] <topwobble> joannac: what did you have in mind then?
[00:25:42] <topwobble> we calculate this will have ~7 minutes of downtime for us, which is acceptable
[00:26:13] <joannac> I don't understand why you need the upsert flag
[00:26:30] <cheeser> up with serts!
[00:26:52] <joannac> cheeser: you're such a troublemaker
[00:27:44] <topwobble> not sure either ;) you told me to!
[00:29:05] <topwobble> upsert not needed because I am just setting the value anyways?
[00:29:31] <joannac> if you do $set, you shouldn't need the upsert flag
[00:29:39] <topwobble> joannac: right, makes sense
[00:30:07] <topwobble> joannac: does it hurt? I only ask because we already tested in staging w/ upsert and would prefer to not change anything, nor re-test
[00:30:28] <joannac> I guess not
[00:30:48] <topwobble> ;D good enough for me
[00:30:57] <topwobble> thank god i'm my own boss
[00:31:35] <cheeser> could be awkward if you ever have to let yourself go, though.
[00:32:32] <topwobble> ^^
[00:32:43] <topwobble> it's just not working out anymore
[00:33:03] <cheeser> it's not you, it's... well, it's you actually
[00:34:24] <topwobble> you sexually harassed yourself
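
For reference, a minimal pymongo sketch of the multi-document $set update discussed above; the database, collection, and field names are hypothetical. Since $set only modifies documents that already match, the upsert flag is redundant here (though, as joannac concedes, harmless).

    from pymongo import MongoClient

    coll = MongoClient().mydb.items  # hypothetical database/collection

    # update_many is the shell's multi:true; $set never inserts, so
    # adding upsert=True would change nothing for this update.
    result = coll.update_many({"status": "legacy"},
                              {"$set": {"status": "active"}})
    print(result.modified_count)
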
[01:58:18] <kurtis> Hey guys, I've run into something unexpected. I have compound _ids which should ensure I have unique entries. However, I have some duplicates. I *believe* this is simply due to the ordering of the items in the _id -- but I'm not 100% sure.
[02:02:56] <kurtis> Here's the entries in my collection. I really don't understand why they were duplicated. The only difference I see is the ordering of the elements: https://dpaste.de/qLbw
[02:11:10] <kurtis> I'm looking in another collection where I've save()'d entries (assuming they would update) and they are also duplicated -- however, their field order *is* identical. Any ideas on what might be going on?
[02:12:05] <kurtis> Actually, scratch that last one -- the values I saw were slightly off so that makes sense. The first example still holds though. I'm not sure if the ordering affects the _id uniqueness
[02:16:37] <kurtis> I take that back. Regardless of ordering, I am having issues with duplicates. Here's a paste to show two collections: https://dpaste.de/H4h2#L
[02:17:49] <kurtis> :/ Okay, ignore that. I'm obviously looking too fast. The _id in my second collection is definitely different for one of the fields. I think I've been staring at this for too long. It must be the ordering.
[02:42:29] <kurtis> Okay, I've investigated and that's the only thing I can see that's different. The problem is that I'm using Python and I don't believe it will be easy or efficient to ensure the ordering of these _id fields.
[02:52:41] <kurtis> I was able to get around said problem using bson.son.SON objects. However, this data-transformation seems unnecessary (I'm reading from MongoDB and then writing back in). Also, it may add quite a bit of overhead to my app. If anyone has any suggestions, it'd be great to hear them
[03:10:44] <joannac> Use SON objects.
[03:10:59] <joannac> Or unique index on _id.field1, _id.field2
[03:11:05] <joannac> which should enforce uniqueness
[03:13:12] <kurtis> joannac: That index might be why it was consistent in one and not the other. Thanks! I'm going to use SON objects just to be safe
[03:14:26] <kurtis> joannac: Is there an actual use-case where an _id might have exactly the same data but a different order? If not, would it make sense to modify MongoDB for the expected behavior when dealing with _id's uniqueness?
[03:14:43] <kurtis> Or is this something really low level? I'm not too familiar with Mongo's internals
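
The behaviour kurtis hit: BSON key order is significant, so two compound _id values with the same fields in a different order are distinct. A minimal sketch of the SON workaround, with hypothetical field names (on the Python 2 of this era, plain dicts did not preserve key order):

    from bson.son import SON
    from pymongo import MongoClient

    coll = MongoClient().mydb.events  # hypothetical collection

    # SON serializes its keys in insertion order, so the _id written to
    # the server is deterministic and duplicates cannot arise from
    # dict-ordering differences.
    doc_id = SON([("user", 42), ("day", "2013-12-02")])
    coll.replace_one({"_id": doc_id},
                     {"_id": doc_id, "count": 1},
                     upsert=True)
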
[08:11:06] <ruphos> I'm finding conflicting information in the documentation with regards to using ec2 snapshots to back up raided ebs volumes on a sharded replica set. Is it possible to gain a consistent snapshot with this method?
[08:14:29] <ruphos> This says no: http://docs.mongodb.org/manual/tutorial/backup-databases-with-filesystem-snapshots/#snapshots-with-amazon-ebs-in-a-raid-10-configuration
[08:14:40] <ruphos> but this one says yes: http://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2/#ec2-flush-and-lock-database
[08:36:21] <[AD]Turbo> hello
[13:35:02] <felipefdl> hi, how do I use aggregation with different timezones?
[13:35:20] <Derick> felipefdl: you can't really do that right now :-/
[13:35:56] <felipefdl> Derick, what solution do you recommend to solve this problem?
[13:36:47] <Derick> store the data in the timezone you want to process it on?
[13:37:55] <felipefdl> but mongodb uses UTC, does it not?
[13:38:28] <Derick> yeah - just to check, what are you trying to do/accomplish?
[13:41:14] <felipefdl> I'm building analytics software, and must query information from different timezones.
[13:41:37] <Derick> in grouping?
[13:41:57] <felipefdl> yes
[13:42:28] <felipefdl> maybe I think use this:
[13:42:35] <felipefdl> db.dates.aggregate(
[13:42:35] <Derick> but for a single group aggregation run, do you need to handle multiple timezones?
[13:42:35] <felipefdl> {$project: {"tDate":{$add: ["$tDate", 60*60*1000]}, "eventCount":1, "customer":1}}
[13:42:35] <felipefdl> )
[13:42:38] <felipefdl> http://stackoverflow.com/questions/18852095/how-to-agregate-by-year-month-day-on-a-different-timezone
[13:42:41] <Derick> felipefdl: please use a pastebin!
[13:42:52] <felipefdl> 3 lines
[13:42:58] <felipefdl> ok, sorry
[13:43:02] <Derick> doesn't matter, IRC mangles layout
[13:44:08] <felipefdl> Derick, I need a single timezone per aggregation
[13:44:23] <Derick> felipefdl: yeah, then the trick with $add will work
[13:45:13] <felipefdl> Derick, and what about performance loss, do you think there will be any?
[13:46:28] <Derick> there will be some of course as more things need to be run
[13:47:38] <felipefdl> Derick, ok, thanks
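
A sketch of Derick's $add trick in pymongo: shift the stored UTC timestamps by one fixed offset (+1 hour below, matching felipefdl's snippet) before grouping. Field names follow the snippet; one aggregation run handles one timezone.

    from pymongo import MongoClient

    coll = MongoClient().mydb.dates  # hypothetical collection

    offset_ms = 60 * 60 * 1000  # target timezone's UTC offset
    pipeline = [
        # $add of a date and a number yields a shifted date, so the date
        # operators below effectively see local time for this timezone.
        {"$project": {"tDate": {"$add": ["$tDate", offset_ms]},
                      "eventCount": 1, "customer": 1}},
        {"$group": {"_id": {"y": {"$year": "$tDate"},
                            "m": {"$month": "$tDate"},
                            "d": {"$dayOfMonth": "$tDate"}},
                    "events": {"$sum": "$eventCount"}}},
    ]
    for row in coll.aggregate(pipeline):
        print(row)
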
[15:55:21] <tiller> hi there!
[15:56:52] <tiller> Is it possible to retrieve with a find only the subdocument that matched our query, while also retrieving the other fields, without naming them in the query?
[15:57:12] <tiller> for example, with this document: {a: 1, b: [ {c: 1}, {c: 2} ] }
[15:58:09] <tiller> I want {a: 1, b: [ {c: 1} ]} (with or without the array, I don't care). But if I use db.test.find(..., {"b.$.c": 1}); the field "a" won't appear in the result
[15:58:54] <tiller> Do I HAVE TO use an aggregate with an unwind to do so?
[15:59:01] <MatheusOl> tiller: {"b.$.c": 1, a: 1}
[15:59:31] <tiller> MatheusOl> Yes, but as I said, I don't want to specify other fields' names in the field selection
[15:59:47] <tiller> I want "every field, and only the sub-document that matched"
[16:00:16] <tiller> (or as I tried to say*)
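
Two ways to approach what tiller is after, sketched in pymongo against his example document. Positional projection cannot avoid naming the other fields; keeping every field unnamed needs an aggregation (here $unwind plus a second $match):

    from pymongo import MongoClient

    coll = MongoClient().mydb.test  # holds e.g. {"a": 1, "b": [{"c": 1}, {"c": 2}]}

    # Positional projection: returns only the first matching element of b,
    # but any other wanted field ("a") must be listed explicitly.
    doc = coll.find_one({"b.c": 1}, {"b.$": 1, "a": 1})

    # Aggregation alternative: all top-level fields survive unnamed, at
    # the cost of unwinding the array and re-matching the element.
    pipeline = [
        {"$match": {"b.c": 1}},
        {"$unwind": "$b"},
        {"$match": {"b.c": 1}},
    ]
    results = list(coll.aggregate(pipeline))
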
[17:25:07] <nmschulte> Is there support for MongoDB drivers in this channel? e.g. MongooseJS?
[17:55:30] <maginot> hello. I think I did something very wrong. I did this db.shards.update({ _id : 'shard0000'}, { $set : { 'host' : 'mongo1.shardserver' }}, { 'upsert' : true }) and now I can't access mongo console
[17:56:57] <maginot> can anybody help me? It's really an emergency :(
[19:02:17] <Guest58437> Hello, I'm trying to convert this group() query into aggregate(): db.relationships.group({keyf:function(doc) { return {sourceId:doc.sourceId} }, initial: {count: 0},reduce: function(curr, result) { result.count += 1; }})
[19:02:54] <Guest58437> I tried "db.relationships.aggregate([ {$group:{"_id":{Id:"$sourceId"}, total: {$sum: 1}} }, {$project:{"sourceId":1}} ])" , but not sure if that's right
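
For comparison, a pymongo sketch of the equivalent pipeline. The trailing $project in the attempt above would actually discard the computed total; $group alone already returns each sourceId (as _id) with its count:

    from pymongo import MongoClient

    coll = MongoClient().mydb.relationships

    # Same result as the group() call: one row per sourceId with a count.
    pipeline = [{"$group": {"_id": "$sourceId", "total": {"$sum": 1}}}]
    for row in coll.aggregate(pipeline):
        print(row["_id"], row["total"])
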
[19:51:26] <seban> According to http://docs.mongodb.org/manual/reference/limits/ a BSON document can't be larger than 16MB. Is that a permanent limit or can I configure it somehow? Is it possible to increase the maximum size of a BSON doc in the future?
[19:52:15] <kali> seban: it used to be a #define in the code. you need to recompile mongodb to change it
[19:52:48] <kali> seban: it's not recommended.
[19:53:11] <rafaelhbarros> seban: if you want to change that, please revise your architecture.
[19:54:05] <kali> yeah. you'd better re-think your schema
[19:54:33] <rafaelhbarros> if you really wanna store that, use gridfs, which I also advise against.
[19:59:49] <paulkon> seban: don't nest anything that grows without bounds
[19:59:54] <paulkon> you will run into problems
[20:04:00] <rafaelhbarros> borrowing the zen of python's line
[20:04:19] <rafaelhbarros> "flat is better than nested"
[20:04:27] <rafaelhbarros> "sparse is better than dense"
[20:06:16] <seban> kali, rafaelhbarros thanks. I was just curious about it. I hit this today when I ran aggregation on my sharded cluster.
[20:07:27] <rafaelhbarros> flatten that out
[20:07:40] <rafaelhbarros> =)
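
A minimal sketch of the GridFS route rafaelhbarros mentions (while advising against it), assuming a local mongod. GridFS splits a blob across many small chunk documents, so no single document approaches the 16MB cap:

    import gridfs
    from pymongo import MongoClient

    db = MongoClient().mydb
    fs = gridfs.GridFS(db)

    # put() stores the payload as fixed-size chunks in fs.chunks with
    # metadata in fs.files; the 16MB per-document limit never applies.
    file_id = fs.put(b"\x00" * (32 * 1024 * 1024), filename="big.bin")
    data = fs.get(file_id).read()
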
[20:57:02] <klj613> downloads of the centos/rhel rpm are verrry slow :(
[21:01:21] <wjb> Is db.serverStatus( { workingSet: 1 } ) performant?
[21:01:48] <cheeser> that's a question without context or a real answer.
[21:02:46] <wjb> Busted. The context is whether or not I should run it against our prod database. What do I need to know to determine the answer?
[21:03:10] <rafaelhbarros> environment variables man...
[21:03:35] <rafaelhbarros> wjb: when you want to run on prod?
[21:04:18] <wjb> When? Whenever I can. Are you saying to run it at a low utilization time?
[21:05:55] <rafaelhbarros> oh
[21:06:05] <rafaelhbarros> you want to run something when the load of your server is low, is that it?
[21:07:04] <rafaelhbarros> wjb: you can have a crontab job that checks the # of connections
[21:07:13] <rafaelhbarros> wjb: I don't know if that's the right way tho
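
For what it's worth, the call itself is just a serverStatus command and can be issued from a driver; a sketch, assuming an MMAPv1-era mongod where the workingSet section still exists (it is opt-in precisely because estimating it adds some overhead):

    from pymongo import MongoClient

    admin = MongoClient().admin

    # Off-by-default serverStatus sections are requested by name.
    status = admin.command("serverStatus", workingSet=1)
    print(status.get("workingSet"))
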
[21:08:13] <lbjay> anyone know if the continue_on_error option to pymongo's collection.insert() depends on a particular version of mongodb?
[21:09:56] <Derick> lbjay: it might be.... which version are you running?
[21:10:10] <lbjay> hmm, nm. appears that's not the problem.
[21:31:55] <azathoth99> hi
[21:32:04] <azathoth99> if I start node 1 of a 3 mongodb set
[21:32:12] <azathoth99> with simple replicaset basic setup
[21:32:20] <azathoth99> and copy some files to its /data/mongo
[21:32:34] <azathoth99> then start node 2 and 3 a few min later
[21:32:48] <azathoth99> will #1 be the 'primary' automatically?
[21:33:07] <jyee> yes
[21:33:21] <Guest99191> Hello ! If I execute the following update command on a doc {$addToSet: {'tags': {$each: [t1,t2,t3]}}} and at the same time a read is made on that doc, is it possible that the read will return [t1,t2] instead of [t1,t2,t3]? How does the $each operator work? Thanks!!
[21:34:04] <kali> Guest99191: updates to one document are atomic
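
In other words, the update Guest99191 describes lands as one atomic document write; a sketch with hypothetical names:

    from pymongo import MongoClient

    coll = MongoClient().mydb.docs  # hypothetical collection

    # The whole $addToSet/$each batch is applied atomically: a concurrent
    # read sees the tags array either before or after all three values,
    # never a partial [t1, t2].
    coll.update_one({"_id": "doc1"},
                    {"$addToSet": {"tags": {"$each": ["t1", "t2", "t3"]}}})
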
[21:34:37] <jyee> azathoth99: or more specifically, there would be an election and nodes 2 and 3 would see that node 1 has the newest, most complete data, so they'd both elect node 1
[21:35:52] <azathoth99> ah
[21:36:12] <azathoth99> thx
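
A quick way to confirm which member won the election, via the replSetGetStatus command (hostname hypothetical):

    from pymongo import MongoClient

    # Connect to any member; the command runs against the admin database.
    status = MongoClient("node1:27017").admin.command("replSetGetStatus")
    primary = [m["name"] for m in status["members"]
               if m["stateStr"] == "PRIMARY"]
    print(primary)
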
[23:07:54] <gimlet_eye> ok
[23:08:01] <gimlet_eye> dropped 157g of prod files on mongo1
[23:08:06] <gimlet_eye> started mongo 2 and 3
[23:08:10] <gimlet_eye> 1 replica set among them
[23:08:12] <gimlet_eye> now
[23:08:23] <gimlet_eye> should 1 take a bit to read all the 157g of new files from prod?
[23:08:34] <gimlet_eye> because it appears to NOT be the primary
[23:08:39] <gimlet_eye> according to a command I ran
[23:08:48] <gimlet_eye> db.isMaster()
[23:09:16] <gimlet_eye> and I thought it would be elected anyhow since it has new files, a lot
[23:10:24] <gimlet_eye> "info" : "loading local.system.replset config (LOADINGCONFIG)",
[23:13:01] <joannac> gimlet_eye: I'm confused. You dropped data on mongo1, and then started up 2 more servers?
[23:13:10] <joannac> So you have a 3-node replica set
[23:13:13] <gimlet_eye> 3 total
[23:13:19] <gimlet_eye> on 1 I added files from prod
[23:13:25] <gimlet_eye> the started mongodb
[23:13:28] <joannac> How did you add them?
[23:13:30] <gimlet_eye> then few min later started 2 and 3
[23:13:33] <gimlet_eye> rsync
[23:13:45] <gimlet_eye> "info" : "loading local.system.replset config (LOADINGCONFIG)",
[23:13:54] <gimlet_eye> I think 1 may still be reading the new data
[23:14:02] <gimlet_eye> and then will be elected?
[23:14:19] <joannac> erm
[23:14:26] <joannac> what's the status in the mongod.log ?
[23:17:10] <gimlet_eye> let me see..
[23:17:55] <gimlet_eye> Mon Dec 2 15:14:55.871 [rsStart] replSet info Couldn't load config yet. Sleeping 20sec and will try again.
[23:18:07] <gimlet_eye> it must be scanning the new 157G
[23:18:08] <gimlet_eye> ????
[23:18:12] <gimlet_eye> or is it wedged?
[23:18:14] <gimlet_eye> hmm
[23:18:19] <gimlet_eye> hmmmmm
[23:19:48] <riiad> Hi everyone
[23:20:26] <riiad> i'm having a problem with some concurrent requests that crash PHP
[23:20:43] <riiad> can someone help me ?
[23:24:00] <gimlet_eye> try #php
[23:24:01] <gimlet_eye> ??
[23:29:43] <riiad> ok, but i think it comes from the php driver for mongodb
[23:36:59] <roryhughes> Hi, Sorry for the long question. I am designing a schema at the moment which stores an array of objects in a document. Each object in the array, for example, stores an A and B which are editable by the user of the web app. It also stores an element C which is metadata (not editable by the user). The problem I am having is figuring out how to update an object in the array if it has been changed but create a new one if it is new.
[23:37:56] <roryhughes> One method I thought of would be to generate random ids for each object created so I could look it up when the user submits a new version of the array and only update the A and B of each, but is there a more conventional/native way to do it?
[23:40:45] <rafaelhbarros> roryhughes: are you looking for upsert?
[23:41:13] <rafaelhbarros> roryhughes: http://docs.mongodb.org/manual/reference/glossary/#term-upsert
[23:41:35] <rafaelhbarros> you can use the _id that mongo creates by default
[23:42:08] <rafaelhbarros> roryhughes: http://docs.mongodb.org/manual/reference/object-id/, ObjectId is a 12-byte BSON type
[23:42:49] <roryhughes> No not upsert but how can i use the _id in an object embedded in an array
[23:45:58] <rafaelhbarros> roryhughes: I dont understand that
[23:46:06] <rafaelhbarros> roryhughes: can you give me some context
[23:46:14] <roryhughes> rafaelhbarros: how can i use the _id in an object embedded in an array like {name: "foo", list: [ {A: "eggs", B: "john", C: 9} ]}
[23:46:35] <roryhughes> So if that ^ was an example of a document
[23:47:25] <roryhughes> I need to be able to receive a new version of the array and update only the objects in the list array that need updating
[23:47:53] <rafaelhbarros> findAndUpdate
[23:48:03] <rafaelhbarros> no...
[23:48:04] <roryhughes> And if there are two documents both with the same C and the user changed the A and B, there is no way of knowing
[23:48:07] <rafaelhbarros> you actually need to do a find
[23:48:11] <rafaelhbarros> then change the array
[23:48:12] <rafaelhbarros> and update
[23:48:14] <rafaelhbarros> that is the fast way
[23:48:32] <rafaelhbarros> well
[23:48:35] <rafaelhbarros> the document
[23:48:46] <rafaelhbarros> {'name':'foo', 'list': [...]}
[23:48:55] <rafaelhbarros> will always contain '_id':ObjectId(...)
[23:49:08] <rafaelhbarros> and you can always use that _id to refer to the document
[23:49:12] <roryhughes> Yes, but that's irrelevant
[23:49:25] <roryhughes> I am talking about referencing the documents inside the list
[23:49:35] <roryhughes> (not really documents)
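
One way to make roryhughes's random-id idea concrete: give every embedded object its own ObjectId, update changed elements through the positional operator, and push genuinely new ones. A sketch using the field names from his example:

    from bson import ObjectId
    from pymongo import MongoClient

    coll = MongoClient().mydb.docs  # hypothetical collection
    elem_id = ObjectId()  # stands in for the _id the client sent back

    # Existing element: match it by its embedded _id and touch only the
    # user-editable fields A and B via the positional $.
    coll.update_one({"name": "foo", "list._id": elem_id},
                    {"$set": {"list.$.A": "eggs", "list.$.B": "john"}})

    # An element submitted without an _id is new: append it with a fresh
    # ObjectId (C stays server-controlled metadata).
    coll.update_one({"name": "foo"},
                    {"$push": {"list": {"_id": ObjectId(),
                                        "A": "spam", "B": "jane", "C": 9}}})
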
[23:50:13] <gimlet_eye> ok
[23:50:35] <gimlet_eye> I copied /data/mongo from prod to a second 3 node set, now the new 3 node set thinks it's prod
[23:51:15] <gimlet_eye> and I can't remove the nodes with rs.remove("blah:port")
[23:51:24] <gimlet_eye> gaaaaah!!