[02:20:03] <tmchaves> Hi. I have the following collection: {book_id:1, commands:[{command:"command_1", sent:false}, {command:"command_2", sent:false}]}. I'm wondering if there is a way to return all commands that have sent == true. I've tried reading the docs but couldn't find it. Could anyone help?
[02:21:42] <spg> afaik you can't grab individual subdocuments
[02:22:25] <spg> that is, you can match against subdocuments within your documents, but you'll get the document back (but you won't get documents that have no matching subdocuments)
[02:24:31] <tmchaves> spg, ok I thought there was a way of specifying a recursive matcher or something, such as find({book_id:1}, {commands:1}) that returns only part of the document
[02:26:10] <tmchaves> is it possible to recover only documents with sent == true? Could you help me do it? db.commands.find({commands:{sent:true}}) does not work
[02:26:24] <php10> doing an upsert with these two documents ($inc on the sum fields) http://pastebin.com/WJ66u2nC ... identical documents, upsert creates two, why?
[02:34:39] <spg> "we have the aggregation framework now (2.1) which would let you filter parent documents independently of embedded documents and also use include/exclude on the parent document fields while still returning only matched embedded documents."
[08:23:02] <fredix> it seems that I have some buffer overflow with auto_ptr<DBClientCursor> cursor
[08:23:54] <fredix> The BSONObj returned by the cursor is overwritten by a memory allocation
[09:24:30] <fredix> what is the difference between BSONObj::getOwned () and BSONObj::copy () ?
[10:13:39] <kali> fredix: getOwned returns a bson object that owns its buffer. if the bsonobject you call it on already owns its buffer, it will return itself. if not, it will call copy()
[10:14:00] <kali> fredix: it's in bson-inl.h, around line 225
[11:18:50] <mkg_> i have a collection with elements that looks like this: http://pastie.org/4170861
[11:19:24] <mkg_> i would like to change the order of the elements in the "c" array, is there some quick way to do this?
[11:19:48] <mkg_> on the whole collection of course
[11:22:36] <mkg_> "c" represents [latitude, longitude], but mongodb documentation suggests that it's better to have it as [longitude, latitude] so i'd like to change the order on the whole collection
[11:23:53] <mkg_> collection holds around 10^9 elements like this so i'd rather like to avoid reinserting all data again...
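There was no server-side operator for reordering array elements at the time, so one hedged option is a client-side pass like the pymongo sketch below (database and collection names are assumptions). It still touches every document, but it avoids re-inserting the data:

    from pymongo import MongoClient

    # Hypothetical names; "c" holds [latitude, longitude] as in the pastie
    # above and is rewritten as [longitude, latitude].
    coll = MongoClient()["test"]["points"]

    for doc in coll.find({}, {"c": 1}):        # fetch only _id and c
        lat, lon = doc["c"]
        coll.update_one({"_id": doc["_id"]}, {"$set": {"c": [lon, lat]}})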
[12:49:44] <souza> Hello guys, I'm having another problem with mongoDB and C. I have a collection whose documents contain an array of embedded objects, and each of those objects has a date attribute. I have to get all elements of this collection where that date is after X minutes from now. My code is here >> http://pastebin.com/QvtsuEYq - I have a query that works in the mongoDB shell: db.Service.find({ "vm.last_date" : { $lt : d } }); (d is a date value)
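The C code is in the pastebin, but for reference here is roughly the same query in pymongo, computing d as a cutoff X minutes away from now (the threshold and database name are assumptions):

    from datetime import datetime, timedelta
    from pymongo import MongoClient

    coll = MongoClient()["test"]["Service"]     # database name is an assumption

    minutes = 5                                 # hypothetical threshold
    d = datetime.utcnow() - timedelta(minutes=minutes)

    # Same shape as the shell query: embedded vm.last_date earlier than d.
    for doc in coll.find({"vm.last_date": {"$lt": d}}):
        print(doc["_id"])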
[13:19:12] <e-dard> Hi all, sorry for the annoying hand-wavey question - I realise there is no solid answer… I'm noticing in my logs that insertions of documents into a collection with about 1.8 million documents are taking between 150 - 300 ms. Is this reasonable? Can I speed this up? The documents are small, avgObjSize is about 215 bytes
[13:20:02] <domeh> what kind of indexes do you have in the collection?
[13:20:44] <domeh> i guess that most of the time would be spent on updating indexes
[13:25:56] <e-dard> NodeX: is that directed at me?
[14:32:40] <Deathspike> NodeX: Picking up where we left off, say I do something along the lines of "db.messages.find({ user_id: { $in : [1,2,3,4,5] }});" to retrieve messages of my friends, could I also get additional information if a message contains a "page_id" to retrieve the related page information (i.e. <ThisGuy> liked <thispage>)?
[14:36:12] <NodeX> that will get you any user of 1 or 2 or 3 with a like or a message
[14:36:21] <NodeX> basically anything your friends have done
[14:38:02] <NodeX> the beauty of schemaless is you can store as many or as few fields as you want... a "type=message" could have 6 extra fields where a "type=link" might just have a link
[14:38:11] <NodeX> just parse it out in your app ;)
[14:38:57] <Deathspike> NodeX: Ok I understand that, but they liked *something*, I'm concerned about getting that reference. I.e. I could save the page you liked in the action collection as a simple name and id to link to, but what if the page name changed? Then the like would be referencing something that became incorrect.
[14:39:19] <NodeX> how often do page names change ?
[14:39:33] <NodeX> this is what you have to consider .. speed vs flexibility
[14:39:37] <Deathspike> Not very often, but when something can go wrong, it most certainly will.
[14:40:11] <NodeX> so on those "not very often" times it's easy to do a quick query to update everything where page_id = foo
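A sketch of that "quick query to update everything" in pymongo, assuming the denormalised copies live in a messages collection under page_id/page_name fields (all names and values here are assumptions):

    from pymongo import MongoClient

    db = MongoClient()["social"]                # hypothetical database name
    page_id, new_name = 42, "New page title"    # hypothetical values

    # Repair every denormalised copy of the page name after a rename.
    db.messages.update_many(
        {"page_id": page_id},
        {"$set": {"page_name": new_name}},
    )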
[14:40:33] <Deathspike> Oh god I feel like a retard now.
[14:41:01] <NodeX> nosql vs RDBMS is a very different way of thinking
[14:41:11] <Deathspike> Still, duplicating data is something that, coming from relational DBs, I frown upon. It's faster, hands down, but it feels icky. Isn't that an excessive waste of data?
[14:41:16] <Deathspike> Yeah, I must learn that :)
[14:41:17] <NodeX> there are no relations and no joins so you have to model very differently
[14:41:37] <NodeX> but how much data is it really ?
[14:41:56] <NodeX> you don't waste it in an RDBMS for the following reasons ... scaling is a nightmare and performance is key
[14:42:10] <NodeX> on mongo scaling is easy and it's still fast
[14:42:36] <Deathspike> Not that I'll ever get to a scale like FB or Twitter, but let's say each page is liked a few thousand times. That's quite a bit of excessive data, but that is something that is usually neglected in favor of speed and scalability in the nosql community then?
[14:42:39] <NodeX> if your data model doesn't suit these sparse updates then model it differently
[14:43:51] <NodeX> It's the way I do things personally because it keeps my app fast and scalable
[14:44:23] <NodeX> considering there are no joins and the alternatives are often slower methods, I would hazard that's how most people tackle this in nosql
[14:45:53] <Deathspike> Ok, I'll have to get used to this. I think I can model my entire app with these suggestions, except for one lingering question: searching. Let's say I want categories for pages and want to allow searching on categories and keywords in titles, how would I go about this?
[14:46:20] <Deathspike> I think I can make a collection with genres and add page_id's in it, and search through that to match pages quickly, but title searching?
[14:46:24] <NodeX> what kind of searching are we talking
[14:47:06] <Deathspike> ~14k entries with a title between 10 and 100 characters.
[14:47:28] <NodeX> mongo uses regex but it's not very efficient without a prefix
[14:48:02] <NodeX> what a few people do (including me in the past) is lowercase and split on space the string and store it in a separate field, index that and use it for regex
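A rough pymongo sketch of the lowercase/split/index technique NodeX describes (collection and field names are assumptions); the anchored prefix regex is what lets the index help:

    import re
    from pymongo import MongoClient, ASCENDING

    coll = MongoClient()["test"]["pages"]       # hypothetical names

    # On write: store a lowercased, space-split copy of the title.
    title = "The Quick Brown Fox"
    coll.insert_one({"title": title, "keywords": title.lower().split()})
    coll.create_index([("keywords", ASCENDING)])

    # On search: an anchored (prefix) regex can use the keywords index.
    prefix = "qui"
    hits = coll.find({"keywords": {"$regex": "^" + re.escape(prefix)}})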
[14:49:09] <Deathspike> You mentioned that's in the past; what is your current preference?
[14:49:42] <NodeX> I use SOLR nowadays for keyword searching because that's what it's designed for
[14:49:59] <NodeX> I use mongo for basic searching (ranges, $in etc)
[14:52:59] <NodeX> SOLR is a great accompaniment to Mongo
[14:53:25] <NodeX> but it's an index and as such is volatile so you need a datastore on the end of it
[14:53:28] <Deathspike> How do you manage title/id relations between the two platforms?
[14:54:00] <NodeX> SOLR has a concept of uniqueKey ... I just set it to the Mongo ObjectId() ... (similar to a primary key in SQL)
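As an illustration of that uniqueKey idea, a hedged Python sketch using pysolr as one possible client (the Solr URL, core, and field names are all assumptions):

    import pysolr
    from pymongo import MongoClient

    coll = MongoClient()["test"]["pages"]                     # hypothetical names
    solr = pysolr.Solr("http://localhost:8983/solr/pages")    # hypothetical core URL

    doc = coll.find_one()
    # Use the Mongo ObjectId, stringified, as Solr's uniqueKey field so a
    # search hit maps straight back to the Mongo document.
    solr.add([{"id": str(doc["_id"]), "title": doc.get("title", "")}])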
[15:01:52] <Deathspike> NodeX: I just read a little about Lucene; it seems this is more focused on "just searching" and not on extra features, so this could be used instead of SOLR?
[15:02:45] <Deathspike> I'm referring to this, http://lucene.apache.org/core/
[15:03:22] <NodeX> me too, SOLR is a "lucene" based search engine
[15:03:32] <NodeX> just like node.js is a javascript based whatever it is
[15:04:00] <NodeX> not strictly true but similar...
[15:04:17] <NodeX> mysql is an SQL based RDBMS ... MSSql is an SQL based RDBMS
[15:05:07] <NodeX> there are other lucene based things .. elastic search for one .. a little green in my opinion at the moment but it's gaining traction
[15:06:37] <horseT> hi, I need to do that using php driver : db.ug.distinct( "u" , { "done":1, ddd:{$gt:20120103, $lt:20120105} } ).length
[15:06:52] <Deathspike> I don't need a whopping lot of features, just search. Well, point being, I should be able to model everything and get a search provider up and running. It's going to be quite complex compared to the single-deploy stuff I used to work on, though.
[15:07:04] <horseT> but no idea how to set the ".length" :(
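The .length in the shell is computed client-side on the array the distinct call returns, so in PHP it would just be count() of that result; the same thing in pymongo, purely for illustration (database name assumed):

    from pymongo import MongoClient

    db = MongoClient()["test"]                  # hypothetical database name

    values = db.ug.distinct("u", {"done": 1, "ddd": {"$gt": 20120103, "$lt": 20120105}})
    print(len(values))                          # the shell's .length is just the array size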
[15:07:45] <NodeX> Deathspike : you won't be able to mix SOLR and Mongo queries unfortunately
[16:54:28] <sir_> am i correct in saying that the file data in gridfs chunks is the pure binary and can be searched against as such, and it's only base64 encoded when displaying results from the shell?
[17:56:48] <lahwran> what would be the most efficient way to store a versioned wiki?
[17:57:20] <lahwran> I'm currently thinking a multidimensional sparse mapping
[17:57:29] <lahwran> ie, documents which have a "key" and a "value"
[17:57:51] <lahwran> the key would be a subdocument, where each sub-key represents the location along a dimension
[17:59:15] <anrope> Hi, I'm trying to create an index to expire data from a collection. I'm working in python, using pymongo. Does the 'ttl' keyword argument to Collection.{create,ensure}_index() correspond to the expireAfterSeconds parameter mentioned in the mongo docs here: http://docs.mongodb.org/manual/tutorial/expire-data/ ? The documentation makes it sound like the pymongo ttl parameter just controls how long ensure_index waits to do a creat
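For the record, the pymongo 'ttl' argument of that era was a client-side cache interval for ensure_index, not the server-side expiry; the expireAfterSeconds option from the tutorial is passed as an ordinary index option, roughly like this (field name and lifetime are assumptions):

    from pymongo import MongoClient, ASCENDING

    coll = MongoClient()["test"]["events"]      # hypothetical names

    # Server-side TTL: documents are removed once createdAt is older than
    # 3600 seconds; createdAt must hold a BSON date (e.g. datetime.utcnow()).
    coll.create_index([("createdAt", ASCENDING)], expireAfterSeconds=3600)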
[18:15:54] <mmlac> Does anyone here have a deep understanding of MONGOID?
[18:20:27] <FerchoDB> A customer has an array of orders. Each order has an array of OrderDetails, and each OrderDetail has a prodId. is there a way, server side, to query all the customers that have an order with an orderdetail with a prodId = 2 ?
[18:20:45] <FerchoDB> I'm sorry, corrected the question and missed the Hello!
[18:53:25] <dogn_one> is it possible to set the key value at insert time? please
[18:54:20] <sir_> dogn_one: can you clarify? when you insert, you are setting all the key/value pairs you want in the document
[18:55:00] <dogn_one> sir_: sorry, I'm talking about an index
[18:59:46] <tomlikestorock> in pymongo, where can I import ObjectId from to create my own object id classes? I can't seem to find it...
[18:59:51] <dogn_one> It happens I have a mess here: I have documents with a reference, some hexadecimal value, which maps to an integer. I want to avoid having to order the documents and then calculate where it starts.
[18:59:57] <tomlikestorock> I'm trying to query by objectid
[19:00:17] <sir_> tomlikestorock: which version of pymongo?
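The import location depends on the pymongo version, which is presumably why sir_ asks; a small sketch of querying by ObjectId (connection, collection name, and the id value are assumptions):

    from bson.objectid import ObjectId   # on very old pymongo: from pymongo.objectid import ObjectId
    from pymongo import MongoClient

    coll = MongoClient()["test"]["things"]      # hypothetical names

    doc = coll.find_one({"_id": ObjectId("4fd97a3b1c9d440000000000")})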
[19:05:45] <patroy> Hi, I have a mongo database and I can't seem to be able to update records that are in a sharded collection. Any reason why that would happen? They insert fine, but I can't update
[19:07:45] <sir_> not familiar with the ruby driver, but you can try changing the update query to a find, and see how many documents are being returned
[19:07:55] <sir_> to ensure it's finding what you expect before updating
[19:08:27] <sir_> i know with sharded collections, if you don't use the _id, you need to include the shard key
[19:58:47] <elarson> I believe i have a race condition in my code and I'm curious how to help avoid it
[19:59:59] <elarson> it seems that we are trying to call update() with upsert=True and I think the pymongo client tries to update, fails, the second writer slips in, causing the next insert to fail...
[20:00:24] <elarson> that probably doesn't really make sense though... sorry for thinking aloud a sec
[20:55:47] <littlen_1> sir_: this is what I was looking for: collection.ensureIndex(new BasicDBObject("reference", 1), new BasicDBObject("unique", true)); I guess, thanks
[21:05:06] <Goopyo> Q: The MongoDB site insists that sharding is the primary method of speeding things up, but what if data size isn't your issue and the amount of traffic is? Surely replication with slave reads would give you a substantial boost?
[21:10:23] <sir_> replication with read_preference being set to SECONDARY gives you your processing power, however
[21:10:42] <sir_> by sharding you have simultaneous lookups on sections of your data being returned to you
[21:25:45] <WormDrink> If I disable the journal - can I lose more than 60 seconds of data?
[21:27:04] <JoeyJoeJo> I can find all documents in a geographical rectangle using $box. Now how can I narrow my results to everything within the rectange where a = 1?
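Extra criteria can just sit alongside the geo clause in the same query document; a pymongo sketch with the location field and box corners assumed ($within/$box was the operator of that era; newer servers spell it $geoWithin):

    from pymongo import MongoClient

    coll = MongoClient()["test"]["places"]      # hypothetical names
    box = [[-74.1, 40.6], [-73.7, 40.9]]        # hypothetical lower-left / upper-right corners

    cursor = coll.find({
        "loc": {"$within": {"$box": box}},      # the geo part
        "a": 1,                                 # narrowed by an ordinary equality match
    })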
[21:32:00] <halcyon918> is there a reason why, when I call {getLastError : 1, w : "veryImportant"} (which we have configured as tags on our replicas), the call just hangs? (I'm in the mongo CLI testing this call)
[21:37:31] <WormDrink> If I disable the journal - can I lose more than 60 seconds of data?
[21:46:45] <dgottlieb> halcyon918: can you confirm the previous insert/update command actually replicated to all the members in veryImportant?
[21:47:37] <dgottlieb> halcyon918: you can also add a wtimeout: <time in milliseconds> to avoid the hang
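In driver terms, the same getLastError options map onto a write concern; a modern pymongo sketch with the tag name taken from halcyon918's example and the timeout value assumed:

    from pymongo import MongoClient
    from pymongo.write_concern import WriteConcern

    client = MongoClient()                      # hypothetical replica-set connection

    # w can be a number, "majority", or a tag defined in the replica-set
    # config; wtimeout (milliseconds) keeps the call from hanging forever
    # if the tag cannot be satisfied.
    coll = client["test"]["events"].with_options(
        write_concern=WriteConcern(w="veryImportant", wtimeout=5000)
    )
    coll.insert_one({"msg": "hello"})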
[21:47:38] <halcyon918> dgottlieb: I'm just reading up on this… I didn't know it was a blocking call (but totally makes sense to me now)...
[21:47:55] <halcyon918> dgottlieb: I'll query each replica individually
[21:50:56] <cgriego> If I wanted to download the 2.0.2 deb package, where would I find that?
[21:52:05] <dgottlieb> WormDrink: your question is a little vague, but in general I would say you can't. Then again I wouldn't exactly bet everything on that answer :)
[21:56:25] <halcyon918> dgottlieb: I see the insert replicate to all three instances, but getLastError still hangs
[21:59:28] <rockets> Anybody ever run MySQL and MongoDB on the same server? Did you have any issues or weirdness?
[21:59:51] <halcyon918> w:'majority' seems to work, but w:'<my tag>' hangs
[22:00:29] <linsys> Rockets: now why would you ever want to do that
[22:00:42] <rockets> linsys: we have legacy stuff that depends on MySQL
[22:01:43] <rockets> It was just a thought. I'll leave things the way they are.
[22:15:09] <JoeyJoeJo> Is it possible to search within search results?
[22:18:57] <dgottlieb> halcyon918: interesting, are you sure the tag is set up right? no additional fourth server that's down? I haven't played with customizing tags before
[22:19:33] <halcyon918> dgottlieb: am I sure it's set up right? I have a DBA… so no :P
[22:20:00] <halcyon918> dgottlieb: at the moment, I'm ok with Majority + Journal Safe
[22:24:27] <dgottlieb> halcyon918: hah! well I'd love to hear anything weird you find if you do wind up poking more at it