[03:00:02] <bmillham> I'm looking for help with a query that orders on an embedded documents field.
[03:01:12] <bmillham> It mostly works, but if the embedded document (a list of embedded documents) has more than one element, I'm not getting the expected results.
[03:28:00] <s2013> anyone know the best way to get random documents from the last 5 days
[03:28:22] <s2013> i can do a count for all the docs in the last 5 days
[03:28:29] <s2013> and then do a random skip or something. but is there something more efficient
[03:40:53] <Boomtime> s2013: there is no built-in mechanism to select something at random, your idea to skip() will work well enough for many cases - though ensure it is indexed and sorted at least
[03:41:15] <s2013> hmm alright. but if i need to get multiple documents do i need to make multiple queries
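A minimal sketch of the skip() approach for a single random pick, assuming a hypothetical "events" collection with an indexed "created" date field; as s2013 suspects, each additional random document is another query:

    var cutoff = new Date(Date.now() - 5 * 24 * 60 * 60 * 1000); // 5 days ago
    var match = { created: { $gte: cutoff } };
    var n = db.events.count(match);
    var doc = db.events.find(match)
                       .sort({ created: 1 })   // indexed + sorted, as Boomtime advises
                       .skip(Math.floor(Math.random() * n))
                       .limit(1)
                       .next();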
[03:42:39] <s2013> also is there any way to backup and restore a db remotely, from one remote db to another?
[03:46:01] <s2013> Boomtime, i see. what do you mean by more involved solution
[03:46:09] <s2013> you mean like change the schema a bit or something
[03:46:42] <Boomtime> s2013: by "backup" you describe a copy-database scenario like this I think: http://docs.mongodb.org/manual/reference/method/db.copyDatabase/
[03:47:28] <s2013> Boomtime, yeah let me check that out. thanks
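A minimal sketch of that call, run from the shell on the destination server (database name and host are hypothetical):

    // copies "mydb" from the remote host into a local "mydb";
    // add username/password arguments if the source requires auth
    db.copyDatabase("mydb", "mydb", "source.example.com:27017")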
[03:48:04] <aniasis> mongodb is basically a flat table?
[07:39:24] <Soothsayer_> Is it good practice to perform delete queries WITHOUT an index on a collection of > 100,000 documents? (It's a non-user-facing query and runs in the background at various events)
[07:43:50] <joannac> Soothsayer_: deleting how? by _id?
[07:44:22] <Soothsayer_> joannac: no, by passing a query on multiple fields (3 to be precise)
[07:46:13] <joannac> how long does the query take to match?
[07:47:55] <Soothsayer_> in my development environment.. 35 ms +
[07:48:42] <Soothsayer_> joannac: but this is at a data set of 25,000 documents
[07:49:01] <joannac> if you can afford the hit, sure
[07:49:16] <joannac> is the index just not useful enough?
[07:49:43] <joannac> why don't you have an index if you're doing this semi-regularly?
[07:49:53] <Soothsayer_> joannac: sorry, takes around 15 ms on remove for a 25,000 record set. (I was doing a sort earlier too).
[07:50:24] <Soothsayer_> joannac: yes, the index is not useful anywhere else.. just for this particular query.
[07:50:55] <Soothsayer_> joannac: hmm true, it's more than semi-regular.. it's pretty regular (whenever a Product gets deleted / disabled, this document has to be removed).
[07:51:04] <Soothsayer_> So it can be at any point, anything.
[12:43:10] <sabrehagen> hi guys, i'm getting some really weird results from mongodb. i have an object in my database, and inspecting it via robomongo, it shows one set of data, but querying the document via mongoose, some fields are empty. when i update the fields of the document that do show, they change, so i know i'm looking at the right document. what might be happening? (see "team":[]) http://i.imgur.com/Z9iEL9J.png
[12:51:48] <sabrehagen> mongoose model configured incorrectly
[13:18:00] <roadrunneratwast> I am starting with Mongoose. Is it possible to define a subtype without making a document? I am developing a taxonomy, each taxon of which has an image, a name, and a comment. I don't think I want to store the taxons in a separate document (do I?) but I would like each node of the taxonomy to have one of these
[14:06:06] <kephu> I was wondering, are there any tools to reduce the ridiculous verbosity of aggregate pipeline's statements?
[14:20:18] <cihangir> hi all, i have an int property and i want to update that field, i can also use $inc. which one is faster, $set or $inc?
[14:20:54] <kephu> my money would be on $inc, but I didn't run any benchmarks on it or anything
[14:21:18] <Derick> cihangir: how would you be using the $set ?
[14:21:32] <Derick> don't do a "find document, calculate new value, set with $set"...
[14:21:59] <cihangir> i know the old value before hand
[14:22:54] <Derick> I'd go with $inc, but I bet the speed difference is negligible
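The two forms under discussion, sketched with a hypothetical counter document; the real difference is that $inc applies the delta server-side and so stays correct under concurrent writers, while $set of a precomputed value can race:

    // $inc: atomic server-side increment
    db.counters.update({ _id: "hits" }, { $inc: { value: 1 } });
    // $set: only safe if you truly know the old value and nothing else writes
    db.counters.update({ _id: "hits" }, { $set: { value: oldValue + 1 } });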
[15:01:36] <Constg> Hello, I need your help, as I'm going mad.
[15:02:03] <Constg> I do an update: var test = db.users_done.update(...)
[15:02:45] <Constg> all is ok in the database, and if I do print(test), then I see nUpserted:1, _id:ObjectId("...")
[15:02:59] <Constg> And...impossible to get this _id except with print
[15:03:20] <Constg> test.nUpserted displays 1, but test._id displays undefined.
[15:03:38] <Constg> Do you have any idea why, and then how to get the last inserted id?
[15:03:53] <Constg> I use no client from language, I'm in Mongo shell
[15:04:09] <blaubarschbube> hi. we are facing a problem with replication. it hangs on moving a chunk. the error message is "InsertDocument :: caused by :: 17280 Btree::insert: key too large to index". how can we get rid of that? is it possible to just skip the error-causing chunks?
[15:04:39] <Constg> blaubarschbube, do you use GridFS?
[15:11:40] <blaubarschbube> so theres no possibility to skip or ignore that chunk?
[15:27:23] <elcha> how can I check that mongo is running with the recommended ulimit values? I noticed that there is an upstart script mongod.conf in my /etc/init with the 64000 open files stanza etc. However, if I cat the limits under /proc for the mongod pid, I see the default 1024 open files setting
[15:33:00] <Constg> Ok, I ask again, from the Mongo shell, in javascript, how to get the last inserted _id please? If I print() my insert, I can see a WriteResult object including the _id, but I can't access it.
[15:33:47] <kali> Constg: you can't. but what you can do is set the _id yourself (the driver just does new ObjectId()) just before inserting
[15:35:58] <Constg> kali, is it as fast as not specifying one? If yes, how could I manage to create the id with update and upsert = true? With a new _id, the object will never be the same and it will result in inserting a new object every time. Or could I specify the _id in $setOnInsert only?
[15:37:22] <kali> Constg: for the performance, it should be about the same, for the upsert... $setOnInsert may work, you need to try it
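A sketch of kali's suggestion, with a hypothetical match field: generate the ObjectId client-side, and $setOnInsert applies it only when the upsert actually inserts:

    var id = new ObjectId();
    db.users_done.update(
        { email: "user@example.com" },      // hypothetical unique match
        { $set: { last_seen: new Date() },
          $setOnInsert: { _id: id } },
        { upsert: true }
    );
    // if the update inserted, the new document's _id is `id`;
    // if it matched an existing document, `id` tells you nothing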
[15:44:08] <Constg> Argh, in fact, no, it doesn't help... what I want is to get the _id even when it was an update, not just an upsert... :-|
[15:44:54] <Constg> I'm afraid I'll have to query the collection for the _id
[15:45:09] <kali> Constg: would it not make sense to use "your" unique key as the _id ?
[15:46:57] <Constg> no, in fact here is what I do: I create an entry in collection1, and update another collection, collection2. If collection2 doesn't know the user from collection1, then it will create it and return the _id, to insert into collection1 to make the link between the two. In this case, creating the ObjectId beforehand works.
[15:47:26] <Constg> but if the user already exists in collection2, then how do I link my new entry in collection1 to the same user
[15:49:41] <Constg> kali, I'm not sure I'm clear...do you understand? ^^
[15:57:05] <Constg> but I'll do it another way: find(old_id); if it doesn't exist, then I create a new _id
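One route the channel didn't reach: findAndModify with upsert and new:true returns the resulting document either way, update or insert, _id included (field names hypothetical):

    var doc = db.collection2.findAndModify({
        query:  { user: "someuser" },
        update: { $set: { last_seen: new Date() } },
        upsert: true,
        new:    true                 // return the post-update document
    });
    print(doc._id);                  // valid whether it matched or inserted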
[15:57:08] <appledash> I'm running mongodb on a server with very low storage space... I found out about the storage.smallFiles option to avoid creating massive journal files
[15:57:12] <appledash> But how do I SET that option
[15:57:23] <appledash> adding storage.smallFiles=true in mongodb.conf says unknown option
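The dotted storage.smallFiles spelling only works in the YAML config format introduced with mongod 2.6; older ini-style config files spell it differently. Both variants, sketched:

    # YAML config file (mongod 2.6+)
    storage:
       smallFiles: true

    # legacy ini-style config file
    smallfiles = true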
[15:57:53] <Constg> You helped me realize I was going about it the wrong way ;)
[16:40:43] <Sengoku> Hey "org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'solrCrudRepository': Invocation of init method failed; nested exception is org.springframework.data.mapping.model.MappingException: Could not lookup mapping metadata for domain class java.lang.Object!
[17:00:32] <aliasc> So it's me again with my game, trying to decide whether mongodb is the right choice for handling my game's load
[17:00:38] <GothAlice> Another ticket closed, 'tis a good day.
[17:01:38] <GothAlice> aliasc: All I can say is that yes, MongoDB is capable of supporting Facebook-game style loads (via a standard webapp interface) with up to a million simultaneous users in our own testing. (We didn't bother testing higher… we were going to cross that bridge when we got to it.)
[17:02:31] <GothAlice> http://www.slideshare.net/montrealpython/mp24-the-bachelor-a-facebook-game was our presentation on our architecture.
[17:04:34] <aliasc> Well. The game will store object data and this can get CPU-intensive.
[17:05:07] <aliasc> It's a co-op-like game where two or more players advance the game, and the progress, including game object positions etc, is stored as coordinates
[17:06:40] <GothAlice> You said you implemented client/server. What're the format of the "packets" being sent between them? JSON strings, Bencode binary coding, Google Protobuf, packed C structs, …
[17:07:41] <GothAlice> That sounds badly engineered. Internally it'll still have to explode out the packed values…
[17:08:06] <aliasc> True. The data are further processed by the client.
[17:08:28] <GothAlice> And you can entirely avoid serialization/deserialization if you use BSON as your wire protocol. You'd be able to retrieve from MongoDB, then need to do *nothing* to the data before passing it off to the client. :)
[17:10:24] <GothAlice> http://bsonspec.org/spec.html — it's a very simple and clean specification, and there are library implementations for almost everything: http://bsonspec.org/implementations.html
[17:10:52] <GothAlice> (With an appropriate offset into the data, you could use the floats straight out of the packed data!)
[17:11:15] <aliasc> Well currently the client gets a json string, explodes the coordinates, does the conversion, then the rendering
[17:11:58] <GothAlice> In the way suggested above your client would get the BSON blob, find the indexes to the coordinates, and pass those pointers off to the rendering.
[17:12:07] <aliasc> It seems like i need to stop thinking of MongoDB as a web technology only.
[17:12:16] <aliasc> I'm used to PHP, NodeJS and web interfaces
[17:12:26] <GothAlice> aliasc: I use it as a massive-scale FUSE filesystem. (25 TiB of data in GridFS and counting…)
[17:13:33] <aliasc> the problem is, i was working on web technologies and RDBMS for the most part. I can't avoid thinking of it as just a web interface.
[17:13:56] <aliasc> This is why i engineered the client as a browser asking for a specific string.
[17:14:04] <GothAlice> Except MongoDB is client/server, speaking BSON, just like your own client/server isn't "web".
[17:14:33] <GothAlice> (However all of these are basically identical: they're stateful transactional request/response cycles.)
[17:14:50] <aliasc> Yea true. And i love MongoDB. I don't want to be disappointed. It's scalable, easy to implement and fast.
[17:20:36] <GothAlice> aliasc: My mantra is usually "help MongoDB help you make cleaner apps" — storing "x:1,y:2" as a string instead of a native structure is indicative of picking the wrong fight.
[17:20:38] <GothAlice> With the result of: client packs string, client bundles in JSON, client fires over the wire, server receives, unpacks JSON, packs BSON, sends the wrapped string to MongoDB. Subsequently MongoDB sends the BSON to the server, the server unpacks, re-packs into JSON, sends to a client, the client receives, unpacks the JSON, unpacks the string, then does something with the numbers.
[17:21:08] <GothAlice> And should make anyone go 'wat'.
[17:23:34] <cheeser> i worked at a company that decided early on to store a certain json document as a string and regretted it ever since.
[17:24:51] <omid8bimo> hello, i need help, i keep getting these errors on my 2 new secondary servers and they are falling further behind the primary every minute
[17:26:22] <GothAlice> omid8bimo: Because I can't be sure that I received all of that (and I likely didn't; I've got paranoid flood protection enabled), could you try that again gist'ed?
[17:34:57] <GothAlice> The significance of using a gem?
[17:36:17] <GothAlice> I.e. regardless of language or driver, querying "fs.files"—the metadata collection—will let you "find a file". http://docs.mongodb.org/manual/reference/gridfs/#the-files-collection < It has a pretty simple structure.
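For example, straight from the shell (the filename is hypothetical, and metadata fields are whatever your app stored):

    db.fs.files.findOne({ filename: "avatar.png" })
    db.fs.files.find({ "metadata.owner": "alice" })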
[17:37:57] <GothAlice> omid8bimo: http://docs.mongodb.org/manual/faq/diagnostics/#does-tcp-keepalive-time-affect-sharded-clusters-and-replica-sets < This may help.
[17:39:43] <aliasc> can you explain whats the best approach i should take
[17:40:03] <GothAlice> aliasc: Is your client written in C or C++? (I.e. do you have easy access to pointers?)
[17:41:12] <GothAlice> Well, even if you don't… using BSON as your own application's wire format will allow you to easily pass along data from MongoDB without requiring any additional work on the part of your server. This is a huge plus. The format is relatively compact (not as compact as packed C structs, but it's still pretty good) and explicitly has lengths on everything, so it's easy to stream.
[17:43:02] <GothAlice> And certainly, avoiding extra conversions (floats->string->json->string->bson->string->json->string->floats) is a must. This would also allow you to perform statistical analysis using MongoDB, as it'd understand that those are actually numbers. It can work with numbers. :)
[17:44:01] <GothAlice> And it's way more packed than JSON for most data. (Numbers esp., since in JSON they're variable length strings.)
[17:59:29] <GothAlice> aliasc: An example packed record containing a player ID and coords: 4_byte_length + "\x10p\x00" + 4_byte_player_id + "\x01x\x00" + 8_byte_x_coord_float + "\x01y\x00" + 8_byte_y_coord_float + "\x00" = 34 bytes. 38 if you need 64-bit player IDs.
[18:00:16] <GothAlice> In hex: 22000000107000abe72900017800000000000000f03f017900000000000000004000 — player 2746283, x 1.0, y 2.0
[18:02:26] <drags> Hello friends, if I'm writing a mongo shell script (as in, to be executed as `mongo myScript.js`), is there something akin to runCommand that accepts queries written as strings?
[18:02:42] <drags> I have an array of queries stored as strings which I want to run, not sure how to execute them
[18:04:04] <GothAlice> drags: I'm not sure how you're storing them as strings, considering queries are formed from rich, potentially deep structures.
[18:04:20] <GothAlice> drags: If you have them JSON encoded, drop the surrounding quotes and bam, that's not a string that's real data now.
[18:07:08] <GothAlice> drags: If they're JSON and you simply want to evaluate them as-needed, run the string through JSON.parse() before passing as the query.
[18:09:19] <drags> GothAlice: ah, so I should say, not just queries, but the whole query line including the "db.<collection>" component, which changes from query to query
[18:09:41] <GothAlice> That… is terrifyingly gross. Eval would be your only "real" option, there.
[18:09:45] <drags> is that something I can store and then re-call&execute?
[18:10:10] <GothAlice> db[collectionvar] works just as well as db.foo (where collectionvar = "foo")
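Putting those two hints together, a sketch of what myScript.js could look like (collection names and query strings are hypothetical):

    // each entry: which collection to hit, and its query as a JSON string
    var jobs = [
        { coll: "users",  query: '{"active": true}' },
        { coll: "orders", query: '{"total": {"$gt": 100}}' }
    ];
    jobs.forEach(function (job) {
        var cursor = db[job.coll].find(JSON.parse(job.query));
        print(job.coll + ": " + cursor.count());
    });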
[18:10:13] <drags> GothAlice: yeah it's intended for a quick internal reporting thing, but I'm thinking it might not be appropriate to do in this manner
[18:10:34] <drags> excellent, that gets me a little closer
[18:10:49] <drags> I'll play around with this and see what I can do. Thanks GothAlice :)
[18:11:00] <GothAlice> One of the few features I actually like in JS: Objects can be accessed via array notation, not just attribute access.
[18:31:07] <wroathe> Question: while reading through the mongo documentation I noticed the following in the FAQs: "MongoDB automatically uses all free memory on the machine as its cache." Do you guys think an argument could be made that using memcached alongside mongodb is overkill, even for large, distributed, applications?
[18:31:37] <wroathe> That is, if your data has been properly normalized (when appropriate) or embedded (when appropriate)
[18:32:27] <GothAlice> wroathe: We may be a little unusual at work, but we use MongoDB as a cache (using TTL indexes), session storage, near-zero-latency queue (capped collections and tailing cursors), as well as general data store.
[18:33:38] <wroathe> That's what I was thinking. It seems like MongoDB would work very effectively as an alternative to memcached.
[18:34:02] <GothAlice> One worthy note: TTL indexes are minute-accurate.
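A sketch of that cache-with-TTL pattern, with hypothetical names; the TTL monitor only sweeps roughly once a minute, hence "minute-accurate":

    db.cache.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 });
    db.cache.insert({ _id: "some-key", value: "...", createdAt: new Date() });
    // the document is removed roughly an hour after createdAt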
[19:00:55] <jonasliljestrand> Hey guys, are there any limits to db.collection.insert? Like how many documents
[19:43:52] <cheeser> for checking things like documents update/written.
[21:03:11] <spenguin> should "mongodb://localhost:27017/test" be sufficient to specify a mongodb connection at localhost on port 27017 using the "test" database?
[21:03:58] <cek> Do I understand correctly that mongodb can hold big tables, and that the bson doc restriction of 16MB is only for one "record", not for the whole "database"?
[21:06:41] <spenguin> mongodb has its own query syntax
[21:07:09] <spenguin> (and I'm not too familiar with SQL queries)
[21:07:45] <spenguin> cek: are you trying to do fuzzy matching?
[21:14:46] <BigOrangeSU> Hi all, does anyone know if it's possible to generate a mongo ObjectId for a given timestamp? I am trying to backpopulate some records.
[21:18:07] <cheeser> depends on the driver, i'd expect. you can pass in a Date object using java
[21:20:42] <BigOrangeSU> do you know if its possible in pymongo? I don't think there is anything in the doc, http://api.mongodb.org/python/current/api/bson/objectid.html
[21:22:53] <joannac> BigOrangeSU: what's the use case?
[21:23:34] <joannac> there is a very clear example of how to do it on that page, as well as a big warning that doing so is not safe for insertion since you lose the uniqueness guarantee
[21:24:09] <BigOrangeSU> joannac: Long story, but I ETL the mongo db to SQL, then I use aggregation and select on min(_id). These are records tracking a user on our site, so I am getting the first record for a given session. However I have to backpopulate some data
[21:24:43] <BigOrangeSU> joannac: thanks yea i see it (though I wont use it)
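For reference, the shell-side version of the trick on that docs page; it is query-only, since zeroing the random bytes forfeits the uniqueness guarantee joannac warns about:

    function objectIdFromDate(d) {
        // seconds since the epoch in hex, padded out to 24 hex chars
        var hexSeconds = Math.floor(d.getTime() / 1000).toString(16);
        return ObjectId(hexSeconds + "0000000000000000");
    }
    // e.g. everything created on or after Jan 1, 2015 (collection hypothetical):
    db.events.find({ _id: { $gte: objectIdFromDate(new Date("2015-01-01")) } });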
[21:28:50] <joannac> BigOrangeSU: how many documents do you need to populate?
[21:29:31] <BigOrangeSU> i think im going to just find another way to do it
[21:30:46] <joannac> yeah, i don't have a good solution for you
[21:31:27] <joannac> BigOrangeSU: you have these original times elsewhere in your document, right? why not sort on those?
[21:38:37] <BigOrangeSU> joannac: i do, it's just that I have so many queries I would have to update
[21:39:15] <BigOrangeSU> joannac: looking back, I was trying to save myself an extra join and simplify the queries by doing min(_id),
[21:39:44] <BigOrangeSU> rather than doing a join back to get columns outside of an aggregate, if that makes any sense
[21:59:21] <spenguin> joannac: should "mongodb://localhost/test" be sufficient to specify a mongodb connection at localhost on port 27017 using the "test" database, or do you have to supply credentials before specifying a different database?
[23:01:24] <blizzow> How can I find the most recently updated/inserted document in a collection?
[23:02:15] <Boomtime> you have a field in your schema called "last_modified" or some such
[23:03:54] <Boomtime> for example, when you do an update or insert op, add { $currentDate: { last_modified: true } }
[23:04:14] <Boomtime> (check my syntax i didn't test it, but it should be something like that)
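Boomtime's syntax checks out; a minimal sketch with hypothetical names, plus the query that then answers blizzow's question:

    db.things.update(
        { _id: someId },
        { $set: { status: "done" },
          $currentDate: { last_modified: true } }
    );
    // most recently updated/inserted document:
    db.things.find().sort({ last_modified: -1 }).limit(1);
    // an index on { last_modified: -1 } keeps that sort cheap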