[00:00:03] <Siegfried> not sure if i should post here. i use python to access mongodb, and when inserting a large integer i get "MongoDB can only handle up to 8-byte ints", but the same insert works from the cli
[00:01:04] <Siegfried> anyway, maybe storing it as a string amounts to the same thing if we have 128-bit ints?
[00:04:05] <vsmatck> Not sure where the problem is. BSON defines 32-bit and 64-bit signed integers.
[00:04:39] <vsmatck> It'd be less efficient to store an ASCII integer than to store a 128-bit int as binary data.
[00:04:55] <vsmatck> bson types here http://bsonspec.org/#/specification
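A minimal pymongo sketch of vsmatck's binary-storage suggestion, using current driver names (MongoClient/insert_one; the 2.x driver of this era used Connection/insert) and a hypothetical "bigints" collection:
    # pymongo rejects integers wider than 8 bytes, so pack the value into
    # 16 big-endian bytes (enough for any unsigned 128-bit int) and store it
    # as BSON binary; unpack it again on the way out.
    from pymongo import MongoClient
    from bson.binary import Binary

    coll = MongoClient("mongodb://localhost:27017").test.bigints

    value = 2**100 + 12345  # wider than 64 bits
    coll.insert_one({"_id": "example", "value": Binary(value.to_bytes(16, "big"))})

    doc = coll.find_one({"_id": "example"})
    assert int.from_bytes(doc["value"], "big") == value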
[02:59:25] <krz> my data structure looks like this: https://gist.github.com/3161542 how do I return only the visits that have a minute value greater than 12? is that possible without doing a map reduce?
[05:21:58] <krz> anyone know how to install mongodb 2.2 in homebrew?
[06:52:58] <sunoano> I didn't find any info on whether or not the current pymongo (2.2.1) already has TTL support. Also, I didn't find any issue regarding this in jira.mongodb.org. Any pointers?
[09:14:01] <NodeX> if you mean multiple then no, you need to loop it
[09:18:52] <omid8bimo> NodeX: what do you mean loop it? i just want to dump a couple of collections in my database. mongodump -c takes only one collection name! any way around this?
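Since mongodump's -c/--collection flag accepts a single collection per run, "loop it" just means one mongodump invocation per collection; a rough sketch driving it from Python, with the database and collection names as placeholders:
    # Dump a handful of collections one at a time.
    import subprocess

    collections = ["users", "logs"]  # hypothetical collection names
    for name in collections:
        subprocess.check_call([
            "mongodump",
            "--db", "mydb",           # hypothetical database name
            "--collection", name,
            "--out", "dump/",
        ])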
[11:49:50] <ra21vi> I have two collections. One contains logs with a key, and the other collection has users with a list of their keys. I have to generate a report of logs for each user. Generally, I need to search all logs of given keys for a user. How can I achieve this with map reduce? I am totally clueless on how to proceed. Does map reduce support multiple collections?
[11:50:42] <ra21vi> i just have to find the logs containing any key in the user's keys array, for a given date range (start date, end date)
[11:53:10] <jY> did you read the tutorial on map reduce yet?
[11:55:52] <ra21vi> jY: yes. I tried to understand and correlate it with my problem. But I am really clueless.
[12:06:03] <ra21vi> jY: does that stackoverflow solution really cover joining multiple collections? I see that they are just reducing each collection that has similar data (logs) into another and then re-reducing. In my case I have logs and users. Just to make the problem clear, here is what I do to find all logs for a given user: user_collection.find({'key': {'$ne':[]}}, fields={'_id': 1}, partial=True) . this gets
[12:06:04] <ra21vi> me all users which have at least some items in the key array. I have logs with the structure: { 'timestamp': ISODate("2012-04-22T20:00:00Z"), 'ops': 'GET', 'bytes': 2047, 'http_status': 200, 'reqid': 'B1CF908AE5' }.
[12:06:47] <ra21vi> oh forgot, the logs also contain 'keyid': 'WSMixayexLR3DHA' .
[12:07:13] <ra21vi> and a user can have multiple keys in its keys array.
[12:07:35] <ra21vi> What I am trying to find is the count of all GET ops logs for all keys of a given user in a given timeframe.
[12:08:16] <ra21vi> Right now, I am generating this with async tasks, which run for each user in my program.
[12:08:17] <jY> sounds like your schema is too relational
[12:08:30] <jY> like you are thinking of the problem like you would with mysql
[12:09:19] <ra21vi> jY: yes, but I cannot merge them, since the logs come from a different source and I have to sync users' keys as well as new users from the app db. I don't have control over the third party system which sends the logs (Amazon S3)
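Map/reduce runs over a single collection, so the usual workaround for this kind of cross-collection report is to fetch each user's key list first and then run one query against the logs; a rough pymongo sketch using the field names from the messages above (count_documents is the current driver's name for a filtered count), everything else hypothetical:
    # Count GET-ops log entries per user for a date window, without joining
    # collections: read the user's key list, then query the logs by those keys.
    from datetime import datetime
    from pymongo import MongoClient

    db = MongoClient()["reports"]  # hypothetical database name
    start, end = datetime(2012, 4, 1), datetime(2012, 5, 1)

    for user in db.users.find({"key": {"$ne": []}}, {"_id": 1, "key": 1}):
        count = db.logs.count_documents({
            "keyid": {"$in": user["key"]},
            "ops": "GET",
            "timestamp": {"$gte": start, "$lt": end},
        })
        print(user["_id"], count)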
[12:16:41] <neil__g> in mongo m/r, how would i get the number of attributes in a document, this.length?
[12:55:05] <Mortah> Just replaced a secondary with another server (same hostname/port but different IP) and although the replica set confirms it is there we're not seeing mongos send any slave queries to it
[12:55:21] <Mortah> flushRouterConfig isn't forcing it... restarting mongos is
[12:55:30] <Mortah> is there a way to do this without restarting all our mongos processes?
[13:01:36] <_pid> hi, any hints..... I have a schema like this { offers: { "ID_329487123984" : { alias: "offer_1" } } }. Now I want to query offers.XXXX.alias: "offer_2", where XXXX doesn't matter..... is it a bad schema or is this possible? something like a query on an array, aaa.$$.bbb ?
[13:03:39] <Mortah> if the schema was like { offers : [ { id: "329...", alias: "offer_1" } ] }
[13:03:47] <Mortah> then you could do offers.id: 329...
[13:05:38] <_pid> I don't have the key.... I want all documents where, for example, offers.**.alias: "offer_1"
[13:07:03] <Mortah> I don't think you can do that with that schema... changing offers to a list and moving the ID inside would let you do .find({"offers.alias": "offer_1"})
[13:07:06] <_pid> it's crap to query (with this one), I think ;-(
[13:07:21] <Mortah> yeah, I don't know how to do it with your current schema..
[13:08:07] <_pid> thank you for confirming what I had feared
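For reference, the restructuring Mortah describes looks roughly like this in pymongo (collection and field names hypothetical):
    # With offers as an array of subdocuments instead of a map keyed by ID,
    # dot notation matches any element of the array, so no wildcard is needed.
    from pymongo import MongoClient

    coll = MongoClient().test.offers_demo  # hypothetical collection

    coll.insert_one({
        "offers": [
            {"id": "ID_329487123984", "alias": "offer_1"},
            {"id": "ID_329487123985", "alias": "offer_2"},
        ]
    })

    # find documents containing an offer with alias "offer_1", whatever its id
    for doc in coll.find({"offers.alias": "offer_1"}):
        print(doc["_id"])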
[13:23:19] <matubaum> hello, I'm getting this error on ubuntu when i try to start mongo: terminate called after throwing an instance of 'std::runtime_error'. what(): locale::facet::_S_create_c_locale name not valid
[14:04:11] <NodeX> ron : it checks the index and does a sort
[14:04:21] <NodeX> just like any other sort / lookup
[14:04:52] <ra21vi> how big can a query be?
[14:05:08] <NodeX> iirc a sort is always applied to every result set as it's either $natural : 1 || $natural : -1
[14:05:40] <ron> NodeX: I see. so finding the 'new'est document requires sorting, which means the query will still take O(n log n) even though it could take O(n).
[14:06:20] <NodeX> ra21vi : I don't think there is one
[14:06:31] <adamcom> ron: it's still traversing a b-tree one way or the other
[14:06:33] <NodeX> no the newest will be in the query cache
[14:06:47] <NodeX> because the default sort is always $natural
[14:07:19] <NodeX> (provided the query cache has not been emptied [LRU])
[14:07:36] <ron> provided the data hadn't changed.
[14:08:14] <NodeX> the index should still be in memory (if it fits) so will still be fast
[14:08:22] <ron> adamcom: it's been too long from school, I don't remember the specifics of b-trees :-/
[14:08:35] <NodeX> but the result will not be a cached one as it's got to re-sort the b-tree
[14:08:55] <ron> that's assuming there is an index. if not, then yay for us, right?
[14:09:59] <adamcom> heh, ron, don't worry, that's what I use wikipedia for http://en.wikipedia.org/wiki/B-tree
[14:10:15] <NodeX> ra21vi : in addition to my comment - it's not really logical to make a massive query because your data will have to be indexed for it which will make your indexes huge
[14:10:18] <ron> I'm asking hypothetically, by the way. One of my employees was concerned regarding retrieving the 'newest' document and its performance. After I realized he was talking about a capped collection of 100 entries, there was a bit of yelling to go around.
[14:12:49] <NodeX> "natural order: The order in which a database stores documents on disk. Typically this order is the same as the insertion order. Capped collections, among other things, guarantee that insertion order and natural order are identical."
[14:13:20] <ron> I imagine in a non-sharded environment though.
[14:13:20] <adamcom> NodeX: was just about to link that myself
[14:13:34] <adamcom> ron: you can't shard capped collections
[14:13:56] <ron> adamcom: just shows you how much we use sharded envs/capped collections.
[14:13:56] <NodeX> +1 for the new docs.mongodb.org @adamcom .. they're much nicer to read
[14:14:36] <ron> come to think of it, it's probably not the latest document we need, but rather the one that was updated most recently (i.e. using a timestamp field).
[14:14:40] <adamcom> you also want to be careful with updates on capped collections too, ron - you can't have an update grow a document
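For the capped-collection case being discussed, the newest document can be read back by reversing natural order; a minimal pymongo sketch (collection names and size are hypothetical, and the timestamp variant assumes an updated_at field, ideally indexed):
    # On a capped collection, insertion order and natural order coincide, so
    # sorting by $natural descending and taking one document yields the newest.
    from pymongo import MongoClient, DESCENDING

    db = MongoClient().test
    events = db.create_collection("events", capped=True, size=1024 * 1024)
    events.insert_one({"n": 1})

    newest_inserted = events.find_one(sort=[("$natural", DESCENDING)])

    # For "most recently updated" rather than "most recently inserted",
    # sort on a timestamp field instead.
    most_recent = db.docs.find_one(sort=[("updated_at", DESCENDING)])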
[14:14:50] <NodeX> Google seems to love them too... I do my lazy "MongoDB findOne()" search in Google and it lands your links right near the top
[14:17:37] <ron> NodeX: it's not the same sphinx, you know ;)
[14:18:03] <ron> adamcom: well, for a while there, the first developer here actually configured the driver to ignore errors, so we had no idea why things didn't fail but didn't work either.
[14:18:05] <NodeX> ah, my mistake, I didn't know there was more than one
[14:23:45] <ron> okay, so just to sum things up (even though it makes perfect sense), if I want to find the document with the highest value of a field, it will have to traverse the entire b-tree no matter what in order to find the relevant document?
[14:25:00] <ron> and traversal will be the same no matter which method is used to query the highest value of a field (assuming it's not only sort(...).limit(1))?
[14:31:20] <ron> excellent! so mongo doesn't offer an O(n) method of finding a maximum value no matter what! is that correct?
[14:31:27] <NodeX> a MAX() would be nice but there isn't one .. probably because find({foo : { $gt : 0}}).limit(1).sort({foo:-1}); does the trick
[14:32:21] <ron> it does the trick, though computationally it should be more expensive (unless I'm missing out on something).
[14:33:26] <NodeX> I don't know which one would be less expensive
[14:35:06] <ron> If you need to find the maximum value, you need to go through all the entries once; that is, O(n). if you sort and limit, sorting would be O(n log n)...
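Worth noting: with an index on the field, the sort/limit query NodeX gives doesn't pay an O(n log n) sort at query time, because the server just reads from one end of the b-tree; a hedged pymongo sketch (names hypothetical):
    # Emulate MAX(foo): index the field, then read the single largest value.
    # With the index in place, the server walks the b-tree from its high end
    # rather than sorting the whole result set in memory.
    from pymongo import MongoClient, DESCENDING

    coll = MongoClient().test.scores  # hypothetical collection
    coll.create_index([("foo", DESCENDING)])

    top = coll.find_one(sort=[("foo", DESCENDING)])
    max_foo = top["foo"] if top else None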
[14:35:19] <Mortah> http://www.colinhowe.co.uk/2012/jul/23/ssds-on-aws--impact-on-conversocial/ <- MongoDB on SSDs on AWS if anyone cares for performance examples :)
[14:37:28] <NodeX> it's not something I have ever really looked into; I assume that the reason behind no MAX() is that my above query works and perhaps some internals optimise for it?
[14:37:31] <ron> The real question is, though, assuming what you previously said: if the query is run once, the result is cached and will already be sorted next time? Is that correct?
[14:37:48] <NodeX> MAX/MIN are big things in *SQL so I am assuming it's not an oversight
[14:38:23] <ron> to be clear, there will be a cache of the sort, so it won't be sorted again. correct?
[14:38:54] <NodeX> from my understanding of the docs - yes
[14:39:37] <ron> but if a new document is inserted to the collection, I imagine the sort will happen all over again when the query is executed again.
[14:39:41] <NodeX> what I do know is ... if I do a query and a count with a sort in it then run the same query again, the second query is faster
[14:40:25] <NodeX> I doubt the indexer is smart enough to know the inserted document affected the query cache and add to it - it probably invalidates the cache ^^
[14:42:11] <ron> hmm, that's a bit difficult to know, but in any case, from what you're saying, there's no way to execute the query differently, so there's no way to make it more efficient.
[14:55:02] <amitprakash> Hi.. How can I copy all records from collection A to collection B via pymongo?
[14:57:54] <NodeX> http://www.mongodb.org/display/DOCS/cloneCollection+Command <----- execute that
[15:15:52] <amitprakash> NodeX, I don't really want to clone a collection.. I have collection X where I insert records one by one, and a collection Y which already has records; I want to add all records in bulk from collection X to Y
[15:16:02] <amitprakash> Both collections lie on the same server
[15:16:14] <amitprakash> Collection X is like a transient collection
[15:38:45] <amitprakash> Any ideas how I can do the aforesaid please?
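One straightforward way to do this from pymongo is to stream collection X and bulk-insert into Y; a minimal sketch assuming a current driver (insert_many), hypothetical names, and no _id collisions between X and Y:
    # Copy every document from the transient collection X into Y in batches.
    from pymongo import MongoClient

    db = MongoClient().mydb        # hypothetical database name
    src, dst = db.X, db.Y

    batch = []
    for doc in src.find():
        batch.append(doc)
        if len(batch) == 1000:     # insert in chunks to bound memory use
            dst.insert_many(batch)
            batch = []
    if batch:
        dst.insert_many(batch)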
[15:42:09] <jibay> Hi all, I'm looking for a solution which provides a mail server (only to receive e-mails), with an unlimited number of e-mail accounts. One main feature should be that received e-mails can be accessed via an API.
[15:43:10] <Derick> I think you're in the wrong place then? We're a database, nothing to do with email.
[15:50:00] <jibay> Derick: yes but there is a lot of people here :)
[16:18:17] <hdm> random question - when passing in a json { key : value } block, what is the actual syntax for providing a variable for the key? var mkey = "bob"; { mkey : "value" } just uses the key 'mkey'.
[16:22:58] <hdm> looks like obj syntax instead: var o = {}; o[kname] = val;
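For what it's worth, the same pattern in pymongo is just a dict built with a variable key; a tiny sketch with hypothetical names:
    # In Python, {mkey: "value"} already uses the variable's value as the key,
    # which also works for building dotted field paths dynamically.
    mkey = "bob"
    query = {mkey: "value"}                        # {"bob": "value"}

    field = "offers.%s.alias" % "ID_329487123984"  # hypothetical dotted path
    query_by_path = {field: "offer_1"}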
[17:27:35] <hdm> 72 hour data load with --objcheck, --journal, mongo 2.0.6, still hit those on m/r jobs
[17:27:54] <hdm> maybe a full blown mongoexport/mongoimport json restore to clean it?
[17:28:06] <hdm> db.collection.validate(true) returned all clear
[18:02:24] <tomoyuki28jp> Is there any good example of generating a dynamic form?
[19:45:27] <k_89> hey .. i am trying to figure out a use-case for mongodb in a real world app .. i tried googling but all i can find is abstract stuff like big data .. key-value storage etc etc
[19:55:17] <anthroprose> and then I can find all documents across collections that match that guid
[19:55:54] <anthroprose> or run metrics on logs across different guids using mapreduce
[19:56:45] <k_89> gonna give mongo a shot tonight to see what it's all about .. thanks for your replies
[19:57:12] <anthroprose> k_90: put some data into an array structure that makes sense to you... and then store that in mongo
[19:57:16] <anthroprose> then you will see the power
[19:57:21] <anthroprose> of working with datatypes
[20:28:23] <dAnjou> hey, i'm thinking about a little "weekend" project. basically just a tweet-long text with tags. of course i want to search by tags and so on. would mongodb be a good choice or should i stick with sql?
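One way that kind of data maps onto MongoDB is a tags array with a multikey index, which makes "search by tag" a plain indexed query; a hedged pymongo sketch with hypothetical names:
    # Store each post with its tags as an array; an index on "tags" becomes a
    # multikey index, so equality matches on any element are indexed lookups.
    from pymongo import MongoClient

    posts = MongoClient().weekend.posts   # hypothetical collection
    posts.create_index("tags")

    posts.insert_one({"text": "short tweet-length note", "tags": ["mongodb", "irc"]})

    for post in posts.find({"tags": "mongodb"}):
        print(post["text"])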
[21:29:01] <b0fh_ua> looks like creation/drop of a database is quite an expensive operation
[21:43:44] <gustonegro> wereHamster: hmm. that's how I thought it would work...and that is how my data is structured. Not sure why it is not working on my data.
[21:43:58] <gustonegro> Is there any way to tell the type of a field?
[21:44:29] <gustonegro> (e.g. is there a way to spit out whether my field is an Object or a String)??
[22:03:47] <wereHamster> cast them to the correct type before storing in the database
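On the question of telling a field's type, the $type operator can at least filter by BSON type; a small pymongo sketch (collection and field names hypothetical; servers older than 3.2 take numeric type codes such as 2 for string and 3 for object instead of the aliases used here):
    # Find how many documents stored "content" as a string rather than an
    # embedded object, which is the usual symptom of inconsistent writes.
    from pymongo import MongoClient

    coll = MongoClient().test.datas      # hypothetical collection

    strings = coll.count_documents({"content": {"$type": "string"}})
    objects = coll.count_documents({"content": {"$type": "object"}})
    print(strings, objects)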
[22:26:08] <gustonegro> how do you search for a field that is stored in an object that is itself stored in an array of objects?
[22:28:34] <gustonegro> I want to do something like this: db.datas.find( { "content.patcher.boxes[*].box.class" : "message" } );
[22:28:44] <gustonegro> where "boxes" is an array of "box" objects
[22:28:51] <doxavore> gustonegro: if you have { things: [ {name:'first'},{name:'second'} ] } ... I believe you can do: find({ 'things.$.name': 'second' })
[22:29:40] <doxavore> or in your case: find({ 'content.patcher.boxes.$.box.class' : 'message' })
[22:43:36] <gustonegro> find({ 'content.patcher.boxes.0.box.class' : 'message' }) works, but find({ 'content.patcher.boxes.$.box.class' : 'message' }) does not work
[22:43:55] <gustonegro> is there a wild card char for searching in arrays?
[22:46:58] <gustonegro> nevermind, I see that you don't even need a wildcard. what I want is: find({ 'content.patcher.boxes.box.class' : 'message' })
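To sum the thread up in runnable form: plain dot notation descends through arrays automatically, and the positional $ belongs in updates and projections rather than query filters; a small pymongo sketch mirroring the structure described above:
    # "boxes" is an array of { box: {...} } wrappers; dot notation matches any
    # element, so no wildcard or positional operator is needed in the filter.
    from pymongo import MongoClient

    datas = MongoClient().test.datas
    datas.insert_one({
        "content": {"patcher": {"boxes": [
            {"box": {"class": "message"}},
            {"box": {"class": "comment"}},
        ]}}
    })

    match = datas.find_one({"content.patcher.boxes.box.class": "message"})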
[22:47:24] <augustl> are there any web based mongodb GUIs around, phpmyadmin/couchdb futon/etc style?