PMXBOT Log file Viewer


#mongodb logs for Wednesday the 19th of February, 2014

[00:20:14] <proteneer> anyone use pymongo?
[00:35:19] <joannac> proteneer: yes. if you have a more specific question you should just ask
[02:23:16] <RaceCondition> is filtering by _id allowed to be $or-ed with filtering by other fields?
[02:24:28] <RaceCondition> asking because of this: http://pastebin.com/0bGuKEPd
[02:24:42] <RaceCondition> the query that uses $or is returning nothing, not even an error message
[02:25:05] <RaceCondition> and query whose resultset should be a subset of the first one, is returning 1 document, which is odd
[02:27:01] <cheeser> that first one is ANDing the JOB_ID match with the $or array
[02:27:03] <ruphos> your $or syntax is incorrect: http://docs.mongodb.org/manual/reference/operator/query/or/
[02:27:14] <cheeser> you want both matches in the $or clause
[02:27:19] <ruphos> ^
[02:27:58] <ruphos> a lot of the $operators are best thought of as being in polish notation
[02:28:27] <cheeser> prefix style vs infix
[02:28:48] <RaceCondition> Champi: ah... OK; this query was generated by Lift-Rogue not myself... so it's a bug in the library
[02:29:57] <RaceCondition> Champi, ruphos: thanks for the reply
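The distinction the channel pointed out can be sketched in-memory (the pastebin contents aren't reproduced here, and the field names below are invented for illustration; the toy evaluator only mimics how MongoDB combines top-level filter conditions with AND):

```python
def matches(doc, query):
    """Minimal evaluator for flat equality filters plus $or, mimicking
    how MongoDB implicitly ANDs the top-level keys of a filter."""
    for key, cond in query.items():
        if key == "$or":
            if not any(matches(doc, sub) for sub in cond):
                return False
        elif doc.get(key) != cond:
            return False
    return True

doc = {"_id": 1, "JOB_ID": 42, "status": "new"}

# Wrong shape: JOB_ID is ANDed with the $or array -> JOB_ID AND (status=done)
wrong = {"JOB_ID": 42, "$or": [{"status": "done"}]}
# Right shape: every alternative lives inside the $or array itself
right = {"$or": [{"_id": 1}, {"status": "done"}]}

print(matches(doc, wrong))  # False: status is "new", and it's ANDed in
print(matches(doc, right))  # True: the _id alternative matches
```

So yes, filtering by _id can be $or-ed with other fields, but both branches must sit inside the $or array.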
[02:36:06] <murkk> Anyone able to answer what I suspect to be an easy pymongo question around finding entries?
[02:36:49] <murkk> How does db.entries.find( { "expires": { $gt:"2014-02-10T15:10:00-06:00"}}) look in python?
[02:39:08] <murkk> entries = mongo.db.entries.find({"expires": { $gt:"2014-02-10T15:10:00-06:00"}}) is definitely not it
[03:04:14] <cheeser> i would think you'd need to create a date type rather than that string for a date comparison
[03:20:06] <murkk> I actually am on the python side: entries = mongo.db.entries.find({"cap:expires": { $lt: datetime.now()}}) and get a syntax error at $lt
[03:20:18] <murkk> I tried < in place of it
[03:38:50] <fourq_> Which property of var db = mongoose.createConnection(...); has the native Mongodb connection?
[03:39:15] <fourq_> oops,wrong chan
[03:40:47] <cheeser> murkk: you might need quotes around that $lt. i know in the mongo shell you would.
[03:42:48] <fourq_> ahh db.Db
[03:54:11] <swak> Does collection.findOne only work for indexed stuff?
[03:56:38] <cheeser> swaagie: of course not. you can query on whatever you want.
[03:56:49] <cheeser> indexes just make it faster
[03:58:16] <rafaelhbarros> pymongo is not returning lastErrorObject when calling find_and_modify
[03:58:23] <rafaelhbarros> what am I doing wrong?
[03:59:27] <edsiper> is there any MongoDB C library that supports async queries ?
[04:00:16] <rafaelhbarros> oh, full_response is required
[04:36:08] <the8thbit> Does mongo distinguish between database structure and the actual information in the database?
[04:36:16] <cheeser> what?
[04:36:39] <the8thbit> It would be nice to have the structure of the database on github, without any of the elements in that structure
[04:37:07] <cheeser> you mean the schema?
[04:37:17] <the8thbit> oh, yeah
[04:37:29] <cheeser> well, that's up to your app. you can document that however you like.
[04:38:20] <the8thbit> cheeser: interesting
[04:39:25] <the8thbit> cheeser: what are some recommended schemas?
[04:39:39] <edsiper> is there any MongoDB C library that supports async queries ?
[04:40:02] <the8thbit> I think it would be nice to have the schema on github, but none of the content there, so that e.g., email addresses aren't shared publicly
[04:40:34] <cheeser> the8thbit: depends on your app.
[04:40:51] <cheeser> edsiper: i'm not a C guy but i thought there was one. or maybe that it was under development.
[04:41:39] <the8thbit> cheeser: AGPL social networking
[04:41:44] <the8thbit> sort of
[04:41:58] <cheeser> well, don't do it how the diaspora people did. ;)
[04:42:59] <the8thbit> cheeser: ha, yeah, the goal isn't social networking, but it's a very social-networking-like app. User accounts, private messages, etc...
[05:05:06] <edsiper> cheeser, thanks, i will raise the question in the ML
[05:43:36] <murkk> cheeser: I believe quotes around the $lt did it, well the syntax error went away.
[05:43:41] <murkk> Thanks for your help
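murkk's two stumbling blocks can be avoided together, as a sketch (the collection name comes from the discussion; the `mongo.db` handle is assumed to be a Flask-PyMongo-style connection):

```python
from datetime import datetime, timezone

# In the mongo shell, $gt/$lt may appear unquoted; in Python they must
# be quoted dictionary keys. Dates should be datetime objects, not ISO
# strings, or the comparison is plain lexicographic string ordering.
query = {"expires": {"$gt": datetime(2014, 2, 10, 15, 10, tzinfo=timezone.utc)}}

# entries = mongo.db.entries.find(query)  # commented out: needs a live connection
```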
[09:08:27] <Zelest> It's like a MMORPG
[09:08:33] <Zelest> Without the dexterity
[09:08:38] <Zelest> In short, no dex.. or nodex :D
[09:33:14] <Nodex> :P
[10:26:28] <AlexejK> Hi, got a question regarding the MongoDB Java driver. We have an app that is two different WAR files in one app-server. Both WAR files create MongoClient instances and it works fine in version 2.11.4 of the driver, but with 2.12.0-rc0 it seems to be throwing warnings: "Unable to register MBean org.mongodb.driver:type=ConnectionPool,clusterId=3,host=localhost,port=27017. javax.management.InstanceAlreadyExistsException". I know this is an RC0 but I'm trying to figure out what has changed and what we can do to avoid this, if this is an intended change in the driver
[10:29:06] <Derick> AlexejK: I would report it asap against http://jira.mongodb.org/browse/JAVA
[10:29:43] <AlexejK> Derick, thanks will do that.
[11:29:34] <MmikePoso> Hi, guys. Has anyone been able to capture mongo traffic and then replay it against another system using mongosniff?
[12:10:31] <adimania> hi. I have a standalone mongodb. I want to move it to replica set and add a slave node to it. I want to know that while the data propagates to the slave, will the mongo master be usable?
[12:24:07] <dmarkey> Any aggregation experts? :)
[12:24:38] <kali> dmarkey: just ask your real question
[12:26:05] <dmarkey> kali: well arent you delightful.
[12:28:54] <dmarkey> Anyway, does this aggregation make sense.. product_analytics.aggregate([ { '$group': { "_id" : "$metrics", 'average_logon_times' : { "$avg" : "metrics.logon_times" } } } ] )
[12:29:40] <dmarkey> $metrics being a list of dicts, that contains a logon_times key (which is just an int)
[12:30:27] <Derick> show the document please
[12:31:20] <dmarkey> Gime a sec, it's quite large, i'll create a smaller one
[12:34:44] <dmarkey> Derick: http://pastie.org/8748424
[12:35:00] <dmarkey> Of course, this is python, as i'm using pymongo.
[12:35:17] <Derick> looks ok - does it work?
[12:38:17] <dmarkey> Derick: No :( http://pastie.org/8748433
[12:39:31] <kali> dmarkey: it's a point of irc etiquette really. there are hundreds of people on the channel, you can reasonably expect that there is somebody who can answer questions about aggregation. on the other hand, asking that somebody will step up and present himself as an "expert" is bound to fail...
[12:39:43] <kali> dmarkey: so on irc, it's better just to ask.
[12:41:08] <Derick> dmarkey: what do you want to group *on*?
[12:41:41] <dmarkey> Derick: I just want to average, for all the sub documents in the metrics list.
[12:41:58] <Derick> but you're going to have more metrics lists?
[12:42:24] <Derick> do you want the average over all the items in all metrics lists in all documents?
[12:42:31] <joannac> You might want to group on _id:1 ?
[12:42:42] <Derick> joannac: I tend to use NULL, but yeah
[12:43:06] <Derick> but the unwind on $metrics is still needed I think
[12:43:15] <Derick> gtg now, thanks for taking over joannac ;-)
[12:43:25] <joannac> Derick: is there an actual difference?
[12:45:33] <joannac> But yes, unwind the array, then do your aggregation (assuming you want a global average)
[12:45:50] <joannac> or unwind then group on _id: "$_id" (if you want an average per document)
[12:48:32] <dmarkey> Hmm.. unwinding on $metrics seems to yield the same result..
[12:49:20] <dmarkey> aggregate([ {'$unwind' : "$metrics" } , { '$group': { "_id" : "$metrics", 'average_logon_times' : { "$avg" : "logon_times" } } } ] )
[12:49:56] <dmarkey> although, admittedly, I have no idea what i'm doing.
[12:53:31] <kali> dmarkey: what about: aggregate([ {'$unwind' : "$metrics" } , { '$group': { "_id" : "all", 'average_logon_times' : { "$avg" : "metrics.logon_times" } } } ] )
[12:55:36] <kali> dmarkey: with a $ before metrics.logon_times
[12:56:46] <d-snp> hi guys
[12:58:20] <d-snp> I have a blurb of text that gets inserted into the database a lot of times, so I want to deduplicate it, my current idea is to calculate a hash for it on the app side, and then do an upsert into the database, with the hash as a key
[12:58:34] <d-snp> is that a good idea, or should I use some mongodb built in thing to calculate the hash of the text?
[12:59:07] <_Nodex> there is no built in thing
[13:01:00] <d-snp> okey, so the hash function I'm using is ruby's String#hash, which is rather fast but I'm not sure if it returns unique enough keys
[13:01:15] <d-snp> I can't find any docs on it, could you recommend a hash function for this?
[13:01:24] <Nodex> I don't know ruby sorry
[13:01:24] <d-snp> perhaps just md5, it's fast too right?
[13:01:42] <Nodex> ObjectIds are guaranteed to be unique
[13:04:08] <kali> d-snp: i would use md5. i would not trust ruby's implementation to provide consistent hash values across ruby version bumps
[13:04:30] <d-snp> alright, thanks that's solid advice
[13:04:34] <kali> d-snp: also, if at some point you need to work on your DB outside of the ruby world, ...
[13:04:47] <d-snp> yeah that makes total sense
[14:02:09] <dmarkey> kali: :( {u'ok': 1.0, u'result': [{u'_id': u'all', u'average_logon_times': 0.0}]}
[14:03:06] <kali> dmarkey: even with the $ ?
[14:03:47] <kali> dmarkey: can you dump your document in the shell instead of python, so that I can try ?
[14:07:36] <dmarkey> kali: http://pastie.org/8748639
[14:09:37] <kali> dmarkey: http://pastie.org/8748648
[14:16:33] <dmarkey> kali: thank you
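Kali's pastie isn't reproduced here, but the pipeline the thread converged on has this shape in Python (unwind the array first, and reference the field with a leading $ inside $avg):

```python
# dmarkey's original attempt failed on two counts: grouping happened
# before the metrics array was unwound, and the $avg expression lacked
# the leading $ on the field path.
pipeline = [
    {"$unwind": "$metrics"},           # one document per array element
    {"$group": {
        "_id": None,                   # single global bucket ("all" behaves the same)
        "average_logon_times": {"$avg": "$metrics.logon_times"},
    }},
]
# result = product_analytics.aggregate(pipeline)  # needs a live collection
```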
[14:21:23] <Nodex> just an fyi if you want a prettier print from a find() just add .pretty() .. db.foo.find().pretty()
[14:23:43] <d-snp> weird, did you guys know that if you have an _id that's not a valid ascii/whatever string a collection.update fails?
[14:24:53] <d-snp> I accidentally passed a raw md5 digest, and it didn't like it at all, perhaps it's a ruby driver thing
[14:25:23] <Derick> d-snp: you can use that as an _id - but not as a MongoID type as _id
[14:39:51] <alkar> and MMS experts around?
[14:42:47] <kali> d-snp: wrap it in a BSON::Binary
[14:43:33] <kali> d-snp: or use the string representation. i think Binary is not compatible with the aggregation framework, so maybe sticking to the string representation is a better idea
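The advice in this thread is language-independent, so here is a sketch in Python rather than d-snp's Ruby (the upsert wiring is hypothetical and commented out): key the blurb on its md5 hex digest, stored as a string rather than raw binary.

```python
import hashlib

def blurb_key(text):
    """Stable, cross-language dedup key for a blurb of text, per kali's
    advice: md5 rather than a language's built-in (and version-dependent)
    String#hash, and a hex string rather than raw digest bytes."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical upsert keyed on the digest (needs a live collection):
# coll.update_one({"_id": blurb_key(text)},
#                 {"$setOnInsert": {"text": text}}, upsert=True)

print(blurb_key("hello"))  # 5d41402abc4b2a76b9719d911017c592
```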
[14:54:17] <Number6> alkar: Ask your question - someone might be able to help
[14:56:00] <dmarkey> Is unwinding very memory intensive?
[14:56:55] <alkar> The MMS agent reports strange memory numbers (way more than what is guaranteed) for my mongod replica running in OpenVZ (latest version so it /should/ be safe). Is that something I have to worry about - is there anything that my VPS provider can configure for me to fix this?
[14:59:01] <Number6> alkar: What's your group name?
[14:59:12] <Number6> Feel free to /msg it to me, and I'll take a look
[15:17:38] <Number6> alkar: How do you mean strange?
[15:29:36] <remonvv> I'm curious; anyone any opinions on this : https://jira.mongodb.org/browse/SERVER-12694 . Current MongoDB approach seems broken.
[15:30:16] <kali> dmarkey: not per se, but it can lead to a memory-explosive pipeline if it is followed by a non-streamable operator (like a sort or a non-trivial group)
[15:35:06] <dmarkey> kali: thanks
[15:50:09] <movedx> We have a 2.4.6 cluster. .6 has a few security issues, so we want to upgrade to .9. What's the best way of upgrading to .9 across a cluster? Should I just shutdown the router process and then everything on the other boxes and upgrade?
[15:50:24] <movedx> I assume taking the router process down gracefully will prevent data loss and future read/writes?
[16:10:01] <alkar> @Number6: currently the databases are empty, yet the secondary memory stats (res,virt,mapped) are much higher than the one in primary and are actually
[16:10:27] <alkar> much higher than what the VPS will be able to commit to
[16:11:08] <alkar> pardon the bad syntax there - touchy enter key - shout if it doesn't make sense
[16:16:34] <alkar> basically I dont understand why the numbers are so different between the primary and the secondary
[16:30:48] <blizzow> I have two mongos on the same instance. They are hidden members of replica set 0 and replica set 1. mongodb for rs0 catches up(slowly) and suffers virtually no slave lag. mongodb for rs1 consistently falls behind. Yesterday, I started the hidden member of rs1 and watched it catch up. I come in this morning and the member is 1400 seconds behind. I even tried stopping the other hidden members and it still does not catch up. I've e
[16:31:52] <blizzow> It's not hitting the disk very hard.
[17:23:54] <Number6> alkar: Apologies, I got called into meetings.
[17:24:30] <Number6> alkar: What hypervisor is your primary running on?
[17:26:06] <alkar> @Number6 no worries, same deal here - primary is dedicated
[17:26:39] <Number6> alkar: So no hypervisor?
[17:30:23] <alkar> @Number6: correct
[17:30:55] <alkar> shouldn't at least mapped memory be pretty much the same between replicas?
[17:34:28] <Number6> alkar: It should, but OpenVZ does funky stuff with memory, so having the numbers different between your non-OpenVZ and OpenVZ nodes is expected
[17:35:15] <Number6> I really really dislike that hypervisor because of its memory management - although it's been a while since I've used it personally
[17:35:19] <alkar> @Number6: I suspect so, however mapped memory (if I understand it correctly) is the size of virtual address space that mongod uses to map the disk data - as such it should be consistent between the replicas
[17:36:22] <Number6> alkar: It should. Since MongoDB works this out using Linux system calls, it relies on the VM to correctly report memory usage
[17:36:23] <alkar> @Number6: I do as well. There's one thing it's definitely good at: overselling :) but that was my low budget solution for a replica set
[17:36:59] <Number6> alkar: If you notice between your openVZ hosts, the memory looks the same on both - so my money is on a reporting issue
[17:37:04] <Number6> At the OS level
[17:37:11] <alkar> @Number6: so I guess the only way to tell is migrate the data and run some tests - see how they behave
[17:37:55] <alkar> @Number6: btw, is there a way to view all 3 hosts on the same page when viewing the replica set graphs? (3 column thing instead of 2)
[17:39:59] <Number6> alkar: Just the replicaset page, under "hosts" >> replicaset name
[17:43:51] <alkar> @Number6: that page shows me only two of them :o
[17:45:32] <alkar> @Number6: unless I click on the third replica which then shows me that one by itself
[17:46:00] <Number6> alkar: Ok, try "Dashboards" >> "test". I did you up a really rough and ready one
[17:46:48] <Number6> Dashboards can be customsied under "Dashboards" >> "add chart" >> Selecting the desired options >> Give it a name >> Save
[17:55:01] <alkar> @Number6: so dashboard is the way to go cool tyvm! (deleted yours by mistake - I thought wtf is "test" hehe)
[17:55:23] <alkar> @Number6: all's good mate cheers for the help in this, it probably has to do with the hypervisor
[17:55:55] <foobarius> Trying to install latest perl modules (0.702.2) on ubuntu 12.04 system. Test t/db.t line 52 fails with a Syntax Error. Is this a known problem? Is there a solution?
[18:01:40] <alkar> @Number6: one really final thing if you've got the time, is that inconsistency in oplog gb/hour meter in the new dashboard I made anything to worry about? It looks strange
[18:05:27] <Number6> alkar: That does look strange. Can you do a db.collection.count() on each member of the RS? If that server is off, do an initial sync
[18:09:57] <foobarius> quit
[18:10:04] <cheeser> nevar!
[18:14:20] <Number6> cheeser: Only when you're ahead!
[18:16:38] <cheeser> which is all the time ;)
[18:16:56] <Nodex> ++
[18:17:03] <cheeser> oh. all ahead. not all head. i, uh, i have a large cranium.
[18:17:07] <Nodex> sharp like a razor :P
[18:20:46] <cheeser> great. now no one else can join the channel.
[18:25:14] <synesp> hey I'm looking for a solution(s) for versioning in mongodb
[18:25:47] <synesp> our application is in need of a way to track changes in collections - if a field is changed in a document, we need to know which fields and record the old values
[18:26:24] <synesp> I read that one company had great success by running two separate dbs one for their latest app data and another for historical data
[18:26:27] <cheeser> you could just keep an audit log
[18:28:10] <Nodex> synesp : we do a changelog type scenario
[18:28:51] <synesp> Nodex: how does that work?
[18:28:52] <Nodex> http://www.nodex.co.uk/article/26-05-13/document-record-versioning <--- little background
[18:29:22] <Nodex> we needed it for incremental changes but it's easily adaptable
[18:48:48] <synesp> Nodex: how did you implement this though in your app logic?
[18:51:42] <alkar> @Number6: cheers mate I'll do when I get home - help much appreciated today
[18:53:18] <Number6> alkar: no problem at all - that's what I'm here for
[18:53:39] <Number6> alkar: For my sins, I know far too much about the Linux memory manager :-(
[18:53:54] <alkar> hahahahahah
[19:08:16] <Nodex> synesp : yes, I added something to my wrapper
[19:08:34] <Nodex> you can do it in Mongo but you need to tail the oplog and it's a pain
[19:09:19] <synesp> yeah I'm adding it to the ODM we're using right now
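A minimal sketch of the changelog idea discussed above (the Nodex article describes the actual scheme; the collection names in the comments are hypothetical): before applying a $set, capture the old values of exactly the fields being touched.

```python
def field_diff(old_doc, changes):
    """Old values for exactly the fields an update is about to touch."""
    return {k: old_doc.get(k) for k in changes}

# Hypothetical wiring against live collections:
# old = coll.find_one({"_id": doc_id})
# history.insert_one({"doc_id": doc_id,
#                     "old": field_diff(old, changes), "new": changes})
# coll.update_one({"_id": doc_id}, {"$set": changes})

old = {"_id": 7, "status": "open", "owner": "ada"}
print(field_diff(old, {"status": "closed"}))  # {'status': 'open'}
```

Doing this in app logic (as synesp is) avoids tailing the oplog, at the cost of missing writes that bypass the ODM.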
[19:19:32] <tystr> hmm....all my graphs on the mms dashboard are empty
[19:19:33] <tystr> even though the agent logs are OK
[20:40:46] <roger_m> Hi all, can anyone help me?
[20:41:35] <ron> maybe?
[20:42:06] <roger_m> I made a fork of flask-mongoengine, can anyone help me with a simple question?
[20:48:07] <roger_m> Can anyone help me with a doubt about flask-mongoengine???
[20:57:30] <starsinmypockets> Bit of a general / strategic question: I'm coming from sql, where I'd leverage joins with fairly large queries to build up data objects with a lot of fields with associated. Now, working in node with mongoose and mongo I find myself doing a lot of sequential or parallel queries to get the same data. Am I doing it wrong? Stuff like: get the doc, get child elements, get the author, etc etc, then return selected json...
[21:04:29] <cheeser> starsinmypockets: "possibly"
[21:04:40] <cheeser> modeling questions are hard to answer without context
[21:05:07] <cheeser> but if you're always grabbing certain external documents whenever you grab a document, you might consider embedding those docs
[21:05:32] <cheeser> there's a trade off, though. if A "has" a billion "B" documents, embedding is a bad idea
[21:13:36] <starsinmypockets> For instance, I have a set of categories containing embedded subcategories... I need to "loop" through the subcategories and get all associated documents... in sql I would just use joins - in node I'm considering async.each which seems inefficient...
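cheeser's trade-off can be shown with two plain document shapes (the field names are invented for illustration, not taken from starsinmypockets's app):

```python
# Embedding: one read fetches everything, no follow-up queries. Fits
# when the related data is bounded and usually needed together.
post_embedded = {
    "_id": 1,
    "title": "Hello",
    "author": {"name": "ada", "email": "ada@example.com"},
    "comments": [{"by": "bob", "text": "nice"}],
}

# Referencing: an extra query per relation (the sequential lookups
# described above), but safe when the related set is unbounded.
post_referenced = {"_id": 1, "title": "Hello", "author_id": 42}
```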
[23:19:53] <edwardbc> hello
[23:19:53] <edwardbc> I have a newbie question
[23:19:59] <edwardbc> how can I empty all fields type array on a collection?
[23:20:10] <edwardbc> I'm trying this: db.users.update({}, {$pull: { contacts:{ $size:0 } } }) but won't work
[23:20:24] <edwardbc> lol wait
[23:20:27] <cheeser> $set : { contacts : [] }
[23:21:09] <edwardbc> db.users.update({}, {$set: { contacts:[] } }) <-- won't work
[23:24:34] <edwardbc> (ignore the first {} param, i was adding it by mistake), yet it won't work
[23:24:43] <cheeser> why not?
[23:24:59] <edwardbc> oh no, that wasn't it.
[23:25:13] <edwardbc> cheeser: db.users.update({}, {$set: { contacts:[] } }) didn't work
[23:28:12] <edwardbc> same way this doesn't work db.users.update({}, {$pull: {contacts: {$not: {$size: 0}} } })
[23:28:18] <cheeser> well, you'll probably want: db.users.update({}, {$set: { contacts:[] } }, {multi: true})
[23:28:45] <edwardbc> boom!
[23:28:49] <edwardbc> there goes my brain
[23:28:58] <edwardbc> that was it!, thanks a lot man haha
[23:29:07] <edwardbc> i knew i was missing something :)
[23:29:17] <cheeser> np. *everybody* gets bitten by the multi flag. ;)
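The multi-flag behavior that bit edwardbc can be illustrated in-memory (the shell calls in the comments are from the thread; the helper below is a toy model of the semantics, not the driver):

```python
def apply_set_multi(docs, update, multi=False):
    """Toy model of update()'s multi flag: without {multi: true},
    only the first matching document is modified."""
    changed = 0
    for doc in docs:
        doc.update(update["$set"])
        changed += 1
        if not multi:
            break
    return changed

users = [{"_id": 1, "contacts": ["a"]}, {"_id": 2, "contacts": ["b"]}]

# db.users.update({}, {$set: {contacts: []}})                 -> 1 doc updated
# db.users.update({}, {$set: {contacts: []}}, {multi: true})  -> all docs updated
print(apply_set_multi(users, {"$set": {"contacts": []}}))  # 1
print(users[1]["contacts"])                                # ['b'] - untouched
```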
[23:29:26] <cheeser> now i'm off to catch a train.
[23:30:04] <edwardbc> bye