[00:17:15] <tystr> I know you can use regex to query,
[00:17:31] <tystr> but is it possible to use capture groups to do an update find/replace ?
[00:18:23] <tystr> e.g. I need to replace part of a string value in a bunch of documents
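As far as I know, update() itself can't reference regex capture groups, so the usual workaround for tystr's question is a client-side loop in the shell. A minimal sketch; the collection name "docs" and field "name" are hypothetical:

```javascript
// Re-run the regex in JavaScript and write the rewritten string back.
db.docs.find({ name: /^foo_(.+)$/ }).forEach(function (doc) {
    var fixed = doc.name.replace(/^foo_(.+)$/, "bar_$1");
    db.docs.update({ _id: doc._id }, { $set: { name: fixed } });
});
```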
[00:37:00] <jiffe1> is https://github.com/mikejs/gridfs-fuse/ still the best implementation of a fuse client for gridfs?
[00:50:30] <Honeyman> Hello. Need some hints regarding map/reduce performance and RT usage.
[00:50:37] <Honeyman> 1. As far as I understand, there are limitations in map/reduce which make it generally unusable for "real time" queries. That is, it is generally more intended as, say, an administration tool, rather than a system that can be called on every user request, or even once in a minute/several minutes. Please correct me if I am wrong.
[00:50:58] <Honeyman> 2. The main reason for this is the limitation that only 1 thread of map/reduce can run at a time (not speaking about sharding, etc). The usual "key-value" queries are not limited by this. Please correct me, again (say, if there is some more important factor than that).
[00:54:57] <Honeyman> 3. Are these issues limiting only, say, the db.col.mapReduce() function, or are there any other subsystems which share the map/reduce internals and hence should also be assumed "non-realtime-safe"? db.col.count(), db.col.distinct(), db.col.group()? Does the aggregation framework from 2.2 have the same limitations as mapReduce()?
[00:55:02] <Honeyman> 4. Any known roadmap for when mapReduce can be considered "generally ready for realtime"?
[00:57:12] <Honeyman> In particular, isn't SERVER-4258 expected to solve precisely this issue?
[01:54:54] <Guest70203> why would a database created from within an app not be accessible when using the mongo client?
[01:55:52] <Guest70203> i can still use the app to access the db and various db related files can be seen in /data/db. but it is as if the db doesn't exist when i connect to mongod with the mongo client
[05:17:29] <pr0ton> btw it seems like mongodb doesn't use all the memory that the box has to offer
[05:17:33] <pr0ton> i want it to use as much as it can
[08:13:55] <mids> asyncmongo is an asynchronous python mongodb lib by bit.ly
[08:14:28] <mids> but yeah, only using it with tornado, not twisted
[08:14:37] <mids> so what is up with txmongo? what error do you get?
[08:15:23] <trupheenix> mids, i cannot figure out how to do a find() like in pymongo where i can do a find({"id":"1234"}) to get the entire document associated with that id
[08:16:13] <mids> trupheenix: can you pastebin your current code?
[08:20:06] <trupheenix> mids, i just got disconnected did i miss any other messages from you?
[08:20:24] <mids> if you query for an object id; you most likely have to use pymongo.objectid.ObjectId("4ffd3b65140f5669b3f35f13") (or bson.objectid in newer versions of pymongo)
[08:20:51] <trupheenix> mids, i tried using ObjectId. it didn't work. it gave an error
[08:21:21] <mids> now to query; just use: apps.find({"_id": blabla})
[10:25:51] <naiquevin> hi, I am connecting to mongodb from a django app using the pymongo library. I create an instance of the connection object as a module global, due to which the connection stays open between requests, i.e. the mongod log just shows "connection accepted". How bad is it from mongodb's point of view to have an open connection like this, or should I be connecting on a per-request basis?
[10:29:52] <unknown_had> Hey all, how can i get the ID of the last document added / the last insert id?
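One way to answer unknown_had's question, sketched in the mongo shell: the _id is generated on the client, so you can create it up front and you already know the "last insert id". Collection and field names below are made up:

```javascript
// The _id (an ObjectId) is generated client-side, so create it yourself
// before inserting and keep a reference to it.
var id = ObjectId();
db.orders.insert({ _id: id, item: "abc", qty: 1 });
print("inserted _id: " + id);
```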
[10:55:58] <NodeX> anyone alive who knows the gridfs part of the php driver?
[10:58:10] <NodeX> the docs are sparse on whether one needs to close the connection after getBytes()
[12:06:05] <W0rmDrink> but why are you moving to CouchDB ?
[12:06:28] <remonvv> I don't get why people get all excited about eclipses and whatnot but when ObjectIds switch from 4 to 5, which is a once in every 8 years event, everybody's like "so..wait..is that like the Y2K bug?"
[15:12:38] <warz> hi all. how would i query the last 10 records, for example, out of a collection, but still keep them in the natural sorting order, which is ascending for an _id field, i believe? i think i'd need to skip documents, but how would you dynamically skip the correct amount of documents?
[15:14:15] <warz> or does limit work from the end of the cursor?
[15:22:46] <warz> i understand that part. i think what i need to do is something like this, though: db.collection.find().skip(db.collection.count() - 10)
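A sketch of the two options warz is weighing, against a hypothetical "events" collection: the skip/count approach from above, and sorting descending then reversing on the client.

```javascript
// Option 1 (warz's idea): skip everything but the last 10, order stays ascending.
db.events.find().skip(db.events.count() - 10);

// Option 2: sort by _id descending, take 10, then reverse client-side so the
// results read in ascending (insertion) order again.
var last10 = db.events.find().sort({ _id: -1 }).limit(10).toArray().reverse();
```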
[15:39:26] <diegok> hello. Which is the best way to replace a collection? I mean, I have a collection with 30M docs and a dump with a different 7M. I want to switch as fast as I can. I was about to test renaming the existing one and then restoring the dump... is that ok?, or is there a better way?
[15:43:27] <markgm> NodeX: I am looking to store documents with a few known fields; however, the rest will be determined by a client admin and will fluctuate from subdomain to subdomain. So I am looking to use persistent collections in Doctrine with a flexible schema. Is there any way to do this? Or would it be better to rely on Mongo's driver
[15:43:45] <diegok> NodeX: I'll tell you in a moment. Thanx!
[15:43:59] <NodeX> I wasn't aware that drivers were forcing schemas in a schema free datastore to be honest
[15:44:07] <NodeX> seems kinda against the grain if you ask me
[15:44:34] <markgm> that's what I have been thinking too
[15:44:35] <sinisa> i got invalid operator $unwind
[15:45:10] <NodeX> markgm : I wrote my own wrapper to go around the driver to make light work of most things.. took around a day but it gives great flexibility
[15:46:05] <NodeX> yes it's a bit sad that they force these things
[15:46:24] <markgm> NodeX: Yes, I have pretty much come to the conclusion that I would have to write my own. But I didn't want to waste the time if there was one out there, so I thought I'd come here first
[15:46:25] <NodeX> perhaps as Bofu2U said there is an override?
[15:48:11] <Bofu2U> Random Q for whoever may know: is it possible to take a record, divide a numeric value on each by a constant, and then sort the output by the result?
[15:48:21] <markgm> Yes, I'll make do. It's not like I have anything else to look at haha.
[15:48:24] <Bofu2U> as in... if there were 100,000 records matching the query.
[15:49:07] <sinisa> bofu .. looks like a Redis job :)
[15:49:18] <remonvv> Bofu2U, no, $inc is the only in-place mathematical operator currently supported
[15:50:10] <Bofu2U> Back to SQL for that query for me then, heh.
[15:50:25] <NodeX> markgm : i'll put some usage examples at the bottom ..
[15:50:44] <markgm> NodeX: Cool, that would be very helpful
[15:51:03] <sinisa> i have also "back to sql" situation :)
[15:51:35] <Bofu2U> sinisa, yeah. I have north of 100,000,000 in a collection and need to sort by "relevance to now" aka number divided by timestamp, etc.
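For reference, the 2.2 aggregation framework discussed elsewhere in this log can at least express a computed sort like this at query time (remonvv's point stands for update operators); how it performs on 100M+ documents is a separate question. A sketch with made-up collection and field names:

```javascript
// Compute number / timestamp per document and sort by the result.
// "scores", "score" and "ts" are hypothetical names.
db.scores.aggregate([
    { $project: { score: 1, ts: 1, relevance: { $divide: ["$score", "$ts"] } } },
    { $sort: { relevance: -1 } },
    { $limit: 100 }
]);
```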
[15:52:31] <sinisa> i have a lot of products that need some JOINs and also some calculations
[15:53:11] <sinisa> ofc, can't be done with m/r, 'cause that would be inhumane :)
[15:53:43] <dob_> How can I sort ignoring the case?
[16:08:30] <sapht> and the documentation just says to pass "safe=True" to save/update and that options propagate to the "resultant getLastError command"
[16:08:38] <sapht> yeah, but where's that command? i can't find it documented T_T
[16:10:31] <sapht> ah... db.command({"getlasterror":1}) it would seem, let's hope i haven't got this backwards
[16:21:00] <diegok> NodeX: ok, rename can move a collection from one db namespace to another and it also moves/rebuilds the indexes. But that takes a while. Renaming within the same db is instantaneous.
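A sketch of the swap diegok describes: restore the dump into a staging collection in the same database, then rename it over the live one. Collection names ("items_new", "items") are hypothetical:

```javascript
// The second argument (dropTarget = true) drops the existing "items"
// collection before "items_new" takes its name.
db.items_new.renameCollection("items", true);
```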
[16:21:07] <TheEmpath> hi… if mongo 2.0.6 is fsynching, and another connection tries to write, what does mongo do to the other connection?
[16:22:34] <groundnuty> hey, is it possible to use mongodb without running a server, similar to sqlite? if not, do you guys know any nosql dbs like that?
[17:02:13] <TecnoBrat> the question is going to be how does it perform on 110,981,462 documents
[17:03:00] <Blackonrain> So my live database is down, not even sure if it needs repairs. The guy who set up the server has since left. Awesomely enough, the dev who set up mongo apparently put some info about it on trac before he left
[17:03:23] <Blackonrain> and trac is throwing a shit ton of exceptions
[17:07:39] <sapht> i get this message a ton in my mongod logs: "query not recording (too large)" -- but running getlasterror after the failing queries seems to yield {ok:1.0 ...}, am i missing something?
[17:08:23] <TecnoBrat> its not an error, its just not going to print the whole entire query to the log, since its large
[17:08:41] <TecnoBrat> it stops your logs from becoming bloated / spamming
[17:09:09] <sapht> ah, alright.. kinda weird though, the query is relatively small, and i run another one just as often that's larger
[18:19:55] <ranman> TecnoBrat: a lot of this is available in the changelogs if you're interested in specifics
[18:29:41] <TecnoBrat> ranman: I read the release notes, but it doesn't have details (besides linking to the tickets, which doesn't have a lot of detail)
[18:30:04] <TecnoBrat> is there a changelog somewhere else I'm missing?
[18:31:35] <ranman> TecnoBrat: you looked at this? http://docs.mongodb.org/manual/release-notes/2.2/
[18:37:20] <TecnoBrat> yea, it links to the tickets .. which I just looked through the commits, gave me a little more insight
[18:39:48] <ranman> TecnoBrat: 2.2: better aggregation stuff, DB level locking and better pagefault management, data center tags, improved queryoptimizer, new read preference semantics, TTL collections, oplog replay, 16MB documents, shell unicode support/multi-line support, bulk inserts, … a bunch of cool stuff
[19:17:58] <junipr> hello. im wondering, how would i update, $set a property, on all documents using that doc's current value for that property? simply, all my documents have a timestamp, but it's not an actual javascript date object. i want to update all docs, and make it a date object.
[19:18:30] <junipr> i can do this in a loop in code, but was wondering if this is possible via mongo shell using a simple update?
[19:20:00] <warz> i guess its just a js repl, so whatever, ill just write a loop
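A minimal shell loop for the conversion junipr describes, assuming the timestamp field is called "ts" and holds epoch milliseconds; the collection name "docs" is made up:

```javascript
// Rewrite numeric timestamps as Date objects. Adjust (e.g. multiply by 1000)
// if the value is stored in seconds rather than milliseconds.
db.docs.find().forEach(function (doc) {
    if (typeof doc.ts === "number") {
        db.docs.update({ _id: doc._id }, { $set: { ts: new Date(doc.ts) } });
    }
});
```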
[19:31:13] <ConstantineXVI> how do you know if a secondary in a replica set is taking queries? setting slaveOk in drivers, but not seeing the slave register queries/sec
[19:33:22] <chubz> when a mongodb node goes down, is there a way that a new node is automatically created and replaces the dead node?
[19:34:54] <jY> chubz: you'd have to build the logic to do that
[19:35:15] <jY> adding a new replicaset member or shard member is easy
[19:35:38] <chubz> i know i'm reading up on replication architecture in the docs
[19:35:49] <chubz> but i'm wondering how it can do it on its own, cause so far it looks like it's manual?
[19:35:58] <jiffe98> is https://github.com/mikejs/gridfs-fuse/ still the best implementation of a fuse client for gridfs?
[19:36:08] <jY> there is no way in mongo to do it automatically
[19:37:41] <chubz> jY: i guess i'm just going to have to write a script for that then
[19:55:26] <svm_invictvs> Does MongoDB support ordered keys?
[20:16:40] <TecnoBrat> ranman: yea, I knew about the changes in general .. was looking mainly at the details on the locking in general. I figured it out now though
[20:17:17] <TecnoBrat> the aggregation stuff is wicked
[20:17:59] <TecnoBrat> still may not be as fast as we need, we'll see (we are collecting lots of stat counters, and require lots of numerical sums and groups on keys)
[20:18:14] <TecnoBrat> but its a TON faster than the map reduce interface
[20:18:53] <TecnoBrat> but, from reading tickets .. looks like there is a lot of room for improvement still performance wise
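A sketch of the kind of group-and-sum TecnoBrat describes, expressed in the 2.2 aggregation framework; the collection and field names ("stats", "source", "country", "clicks") are hypothetical:

```javascript
// Sum a counter across all documents, grouped by a couple of key fields.
db.stats.aggregate([
    { $group: {
        _id: { source: "$source", country: "$country" },
        totalClicks: { $sum: "$clicks" }
    } }
]);
```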
[20:29:28] <tystr> pre-aggregation is the way to go for sums and the like
[20:30:48] <javvierHund> TecnoBrat: what numbers have you got?
[20:31:01] <TecnoBrat> tystr: as in aggregating down into smaller pre-aggregated collections?
[20:31:26] <TecnoBrat> javvierHund: as in number of documents, etc? or number of fields?
[20:31:53] <javvierHund> i mean whats the performance
[20:33:23] <TecnoBrat> well our dataset is 110 million documents, each document has 9 keys, and up to 41 incrementing fields
[20:33:57] <javvierHund> and whats the performance?
[20:34:37] <TecnoBrat> we have tried some different DBs, like columnar DBs, and we can do aggregation on the entire data set and group / sum the columns in 100ish seconds
[20:34:43] <TecnoBrat> I'm about to run some tests on mongo
[20:34:47] <TecnoBrat> but my gut says, its not that fast
[20:40:18] <javvierHund> its impossible to run say a control chart on preaggregated data ;)
[20:40:43] <tystr> we're doing ours almost exactly the way that's described here: http://docs.mongodb.org/manual/use-cases/pre-aggregated-reports/#pre-allocate-documents
[20:40:44] <TecnoBrat> for a simple use case, say we were tracking 4 keys of: source, subsource, destination, and country
[20:41:04] <TecnoBrat> we would then pre-aggregate down to source, subsource, destination, and drop country
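A sketch of that pre-aggregation step: roll each event up into a summary collection keyed by the three remaining fields (country dropped) and bump counters via upsert. All names here are hypothetical:

```javascript
// One upsert per (source, subsource, destination) bucket.
db.stats_by_dest.update(
    { source: "adnet", subsource: "campaign7", destination: "landing-3" },
    { $inc: { hits: 1 } },
    { upsert: true }  // create the bucket document the first time this key combo is seen
);
```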
[20:45:34] <javvierHund> that kind of means you are trolling
[20:45:47] <TecnoBrat> well tystr our app handles 30,000 requests a second, and mongo handles the updates of the stats just fine (we aggregate in the app as well before we do upserts, so thats not 30,000/s to mongo)
[20:46:08] <TecnoBrat> our aggregation technique has worked well for us so far
[20:46:09] <mw44118> hi -- i'm trying to save a mongo instance that has a CPU that's overwhelmed
[20:46:31] <TecnoBrat> and with the new aggregation changes, I think it will solve some short term issues
[20:46:51] <mw44118> I ran db.currentOp(), and I see a lot of queries that I would like to terminate, because at this point, there's no reason to still run them (they are from webservers). How to do that?
[20:46:54] <TecnoBrat> (the preaggregation technique was actually taking too long to update the sub-collections we use)
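For mw44118's question: db.currentOp() reports an opid for each in-progress operation, and db.killOp() asks the server to stop one. A sketch that kills long-running queries against a hypothetical "mydb.events" namespace:

```javascript
// Kill queries that have been running for more than 30 seconds on one namespace.
db.currentOp().inprog.forEach(function (op) {
    if (op.op === "query" && op.ns === "mydb.events" && op.secs_running > 30) {
        db.killOp(op.opid);
    }
});
```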