[00:17:15] <tystr> I know you can use regex to query,
[00:17:31] <tystr> but is it possible to use capture groups to do an update find/replace ?
[00:18:23] <tystr> e.g. I need to replace part of a string value in a bunch of documents
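As far as I know, update() itself can't reference regex capture groups, so the usual workaround for tystr's question is a client-side loop in the shell. A minimal sketch; the collection name "docs" and field "name" are hypothetical:

```javascript
// Re-run the regex in JavaScript and write the rewritten string back.
db.docs.find({ name: /^foo_(.+)$/ }).forEach(function (doc) {
    var fixed = doc.name.replace(/^foo_(.+)$/, "bar_$1");
    db.docs.update({ _id: doc._id }, { $set: { name: fixed } });
});
```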
[00:37:00] <jiffe1> is https://github.com/mikejs/gridfs-fuse/ still the best implementation of a fuse client for gridfs?
[00:50:30] <Honeyman> Hello. Need some hints regarding map/reduce performance and RT usage.
[00:50:37] <Honeyman> 1. As far as I understand, there are limitations in map/reduce which make it generally unusable for "real time" queries. That is, it is generally more intended as, say, an administration tool, rather than a system that can be called on every user request, or even once in a minute/several minutes. Please correct me if I am wrong.
[00:50:58] <Honeyman> 2. The main reason for this is the limitation that only 1 thread of map/reduce can run at a time (not speaking about sharding, etc). The usual "key-value" queries are not limited by this. Please correct me, again (say, if there is some more important factor than that).
[00:54:57] <Honeyman> 3. Are these issues limiting only, say, the db.col.mapReduce() function, or are there any other subsystems which share the map/reduce internals and hence should also be assumed "non-realtime-safe"? db.col.count(), db.col.distinct(), db.col.group()? Does the aggregation framework from 2.2 have the same limitations as mapReduce()?
[00:55:02] <Honeyman> 4. Any known roadmap for when mapReduce can be considered "generally ready for realtime"?
[00:57:12] <Honeyman> In particular, isn't SERVER-4258 expected to solve precisely this issue?
[01:54:54] <Guest70203> why would a database created from within an app not be accessible when using the mongo client?
[01:55:52] <Guest70203> i can still use the app to access the db and various db related files can be seen in /data/db. but it is as if the db doesn't exist when i connect to mongod with the mongo client
[05:17:29] <pr0ton> btw it seems like mongodb doesn't use all the memory that the box has to offer
[05:17:33] <pr0ton> i want it to use as much as it can
[08:13:55] <mids> asyncmongo is an asynchronous python mongodb lib by bit.ly
[08:14:28] <mids> but yeah, only using it with tornado, not twisted
[08:14:37] <mids> so what is up with txmongo? what error do you get?
[08:15:23] <trupheenix> mids, i cannot figure out how to do a find() like in pymongo where i can do a find({"id":"1234"}) to get the entire document associated with that id
[08:16:13] <mids> trupheenix: can you pastebin your current code?
[08:20:06] <trupheenix> mids, i just got disconnected did i miss any other messages from you?
[08:20:24] <mids> if you query for an object id; you most likely have to use pymongo.objectid.ObjectId("4ffd3b65140f5669b3f35f13") (or bson.objectid in newer versions of pymongo)
[08:20:51] <trupheenix> mids, i tried using ObjectId. it didn't work. it gave an error
[08:21:21] <mids> now to query; just use: apps.find({"_id": blabla})
[10:25:51] <naiquevin> hi, I am connecting to mongodb from a django app using the pymongo library. I create an instance of the connection object as a module global, due to which the connection stays open between requests, i.e. the mongod log just shows "connection accepted". How bad is it from mongodb's point of view to have an open connection like this, or should I be connecting on a per-request basis?
[10:29:52] <unknown_had> Hey all, how can i get the ID of the last document added / the last insert id?
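One way to answer unknown_had's question, sketched in the mongo shell: the _id is generated on the client, so you can create it up front and you already know the "last insert id". Collection and field names below are made up:

```javascript
// The _id (an ObjectId) is generated client-side, so create it yourself
// before inserting and keep a reference to it.
var id = ObjectId();
db.orders.insert({ _id: id, item: "abc", qty: 1 });
print("inserted _id: " + id);
```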
[10:55:58] <NodeX> anyone alive who knows the gridfs part of the php driver?
[10:58:10] <NodeX> the docs are sparse on whether one needs to close the connection after getBytes()
[12:06:05] <W0rmDrink> but why are you moving to CouchDB ?
[12:06:28] <remonvv> I don't get why people get all excited about eclipses and whatnot but when ObjectIds switch from 4 to 5, which is a once in every 8 years event, everybody's like "so..wait..is that like the Y2K bug?"
[15:12:38] <warz> hi all. how would i query the last 10 records, for example, out of a collection, but still keep them in the natural sorting order, which is ascending for an _id field, i believe? i think i'd need to skip documents, but how would you dynamically skip the correct amount of documents?
[15:14:15] <warz> or does limit work from the end of the cursor?
[15:22:46] <warz> i understand that part. i think what i need to do is something like this, though: db.collection.find().skip(db.collection.count() - 10)
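A sketch of the two options warz is weighing, against a hypothetical "events" collection: the skip/count approach from above, and sorting descending then reversing on the client.

```javascript
// Option 1 (warz's idea): skip everything but the last 10, order stays ascending.
db.events.find().skip(db.events.count() - 10);

// Option 2: sort by _id descending, take 10, then reverse client-side so the
// results read in ascending (insertion) order again.
var last10 = db.events.find().sort({ _id: -1 }).limit(10).toArray().reverse();
```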
[15:39:26] <diegok> hello. Which is the best way to replace a collection? I mean, I have a collection with 30M docs and a dump with a different 7M. I want to switch as fast as I can. I was about to test renaming the existing one and then restoring the dump... is that ok?, or is there a better way?
[15:43:27] <markgm> NodeX: I am looking to store documents with a few known fields; however, the rest will be determined by a client admin and will fluctuate from subdomain to subdomain. So I am looking to use persistent collections in Doctrine with a flexible schema. Is there any way to do this? Or would it be better to rely on Mongo's driver
[15:43:45] <diegok> NodeX: I'll tell you in a moment. Thanx!
[15:43:59] <NodeX> I wasn't aware that drivers were forcing schemas in a schema free datastore to be honest
[15:44:07] <NodeX> seems kinda against the grain if you ask me
[15:44:34] <markgm> that's what I have been thinking too
[15:44:35] <sinisa> i got invalid operator $unwind
[15:45:10] <NodeX> markgm : I wrote my own wrapper to go around the driver to make light work of most things.. took around a day but it gives great flexibility
[15:46:05] <NodeX> yes it's a bit sad that they force these things
[15:46:24] <markgm> NodeX: Yes, I have pretty much come to the conclusion that I would have to write my own. But I didn't want to waste the time if there was one out there, so I thought I'd come here first
[15:46:25] <NodeX> perhaps as Bofu2U said there is an override?
[15:48:11] <Bofu2U> Random Q for whoever may know: is it possible to take a record, divide a numeric value on each by a constant, and then sort the output by the result?
[15:48:21] <markgm> Yes, I'll make do. It's not like I have anything else to look at haha.
[15:48:24] <Bofu2U> as in... if there were 100,000 records matching the query.
[15:49:07] <sinisa> bofu .. looks like a Redis job :)
[15:49:18] <remonvv> Bofu2U, no, $inc is the only in-place mathematical operator currently supported
[15:50:10] <Bofu2U> Back to SQL for that query for me then, heh.
[15:50:25] <NodeX> markgm : i'll put some usage examples at the bottom ..
[15:50:44] <markgm> NodeX: Cool, that would be very helpful
[15:51:03] <sinisa> i have also "back to sql" situation :)
[15:51:35] <Bofu2U> sinisa, yeah. I have north of 100,000,000 in a collection and need to sort by "relevance to now" aka number divided by timestamp, etc.
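For reference, the 2.2 aggregation framework discussed elsewhere in this log can at least express a computed sort like this at query time (remonvv's point stands for update operators); how it performs on 100M+ documents is a separate question. A sketch with made-up collection and field names:

```javascript
// Compute number / timestamp per document and sort by the result.
// "scores", "score" and "ts" are hypothetical names.
db.scores.aggregate([
    { $project: { score: 1, ts: 1, relevance: { $divide: ["$score", "$ts"] } } },
    { $sort: { relevance: -1 } },
    { $limit: 100 }
]);
```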
[15:52:31] <sinisa> i have a lot of products that need some JOINs and also some calculations
[15:53:11] <sinisa> ofc, can't be done with m/r, 'cause that would be inhumane :)
[15:53:43] <dob_> How can I sort ignoring the case?
[16:08:30] <sapht> and the documentation just says to pass "safe=True" to save/update and that options propagate to the "resultant getLastError command"
[16:08:38] <sapht> yeah, but where's that command? i can't find it documented T_T
[16:10:31] <sapht> ah... db.command({"getlasterror":1}) it would seem, let's hope i haven't got this backwards
[16:21:00] <diegok> NodeX: ok, rename can move a collection from one db namespace to another and it also moves/rebuilds the indexes. But that takes a while. Renaming within the same db is instantaneous.
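A sketch of the swap diegok describes: restore the dump into a staging collection in the same database, then rename it over the live one. Collection names ("items_new", "items") are hypothetical:

```javascript
// The second argument (dropTarget = true) drops the existing "items"
// collection before "items_new" takes its name.
db.items_new.renameCollection("items", true);
```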
[16:21:07] <TheEmpath> hi… if mongo 2.0.6 is fsynching, and another connection tries to write, what does mongo do to the other connection?
[16:22:34] <groundnuty> hey, is it possible to use mongodb without running a server, similar to sqlite? if not, do you guys know any nosql dbs like that?
[17:02:13] <TecnoBrat> the question is going to be how does it perform on 110,981,462 documents
[17:03:00] <Blackonrain> So my live database is down, not even sure if it needs repairs. The guy who set up the server has since left. Awesomely enough, the dev who set up mongo apparently put some info about it on trac before he left
[17:03:23] <Blackonrain> and trac is throwing a shit ton of exceptions
[17:07:39] <sapht> i get this message a ton in my mongod logs: "query not recording (too large)" -- but running getlasterror after the failing queries seems to yield {ok:1.0 ...}, am i missing something?
[17:08:23] <TecnoBrat> its not an error, its just not going to print the whole entire query to the log, since its large
[17:08:41] <TecnoBrat> it stops your logs from becoming bloated / spamming
[17:09:09] <sapht> ah, alright.. kinda weird though, the query is relatively small, and i run another one just as often that's larger
[18:19:55] <ranman> TecnoBrat: a lot of this is available in the changelogs if you're interested in specifics
[18:29:41] <TecnoBrat> ranman: I read the release notes, but it doesn't have details (besides linking to the tickets, which doesn't have a lot of detail)
[18:30:04] <TecnoBrat> is there a changelog somewhere else I'm missing?
[18:31:35] <ranman> TecnoBrat: you looked at this? http://docs.mongodb.org/manual/release-notes/2.2/
[18:37:20] <TecnoBrat> yea, it links to the tickets .. which I just looked through the commits, gave me a little more insight
[18:39:48] <ranman> TecnoBrat: 2.2: better aggregation stuff, DB level locking and better pagefault management, data center tags, improved queryoptimizer, new read preference semantics, TTL collections, oplog replay, 16MB documents, shell unicode support/multi-line support, bulk inserts, … a bunch of cool stuff
[19:17:58] <junipr> hello. im wondering, how would i update, $set a property, on all documents using that doc's current value for that property? simply, all my documents have a timestamp, but it's not an actual javascript date object. i want to update all docs, and make it a date object.
[19:18:30] <junipr> i can do this in a loop in code, but was wondering if this is possible via mongo shell using a simple update?
[19:20:00] <warz> i guess its just a js repl, so whatever, ill just write a loop
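A minimal shell loop for the conversion junipr describes, assuming the timestamp field is called "ts" and holds epoch milliseconds; the collection name "docs" is made up:

```javascript
// Rewrite numeric timestamps as Date objects. Adjust (e.g. multiply by 1000)
// if the value is stored in seconds rather than milliseconds.
db.docs.find().forEach(function (doc) {
    if (typeof doc.ts === "number") {
        db.docs.update({ _id: doc._id }, { $set: { ts: new Date(doc.ts) } });
    }
});
```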
[19:31:13] <ConstantineXVI> how do you know if a secondary in a replica set is taking queries? setting slaveOk in drivers, but not seeing the slave register queries/sec
[19:33:22] <chubz> when a mongodb node goes down, is there a way that a new node is automatically created and replaces the dead node?
[19:34:54] <jY> chubz: you'd have to build the logic to do that
[19:35:15] <jY> adding a new replicaset member or shard member is easy
[19:35:38] <chubz> i know i'm reading up on replication architecture in the docs
[19:35:49] <chubz> but i'm wondering how it can do it on its own, cause so far it looks like it's manual?
[19:35:58] <jiffe98> is https://github.com/mikejs/gridfs-fuse/ still the best implementation of a fuse client for gridfs?
[19:36:08] <jY> there is no way in mongo to do it automatically
[19:37:41] <chubz> jY: i guess i'm just going to have to write a script for that then
[19:55:26] <svm_invictvs> Does MongoDB support ordered keys?
[20:16:40] <TecnoBrat> ranman: yea, I knew about the changes in general .. was looking mainly at the details on the locking in general. I figured it out now though
[20:17:17] <TecnoBrat> the aggregation stuff is wicked
[20:17:59] <TecnoBrat> still may not be as fast as we need, we'll see (we are collecting lots of stat counters, and require lots of numerical sums and groups on keys)
[20:18:14] <TecnoBrat> but its a TON faster than the map reduce interface
[20:18:53] <TecnoBrat> but, from reading tickets .. looks like there is a lot of room for improvement still performance wise
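A sketch of the kind of group-and-sum TecnoBrat describes, expressed in the 2.2 aggregation framework; the collection and field names ("stats", "source", "country", "clicks") are hypothetical:

```javascript
// Sum a counter across all documents, grouped by a couple of key fields.
db.stats.aggregate([
    { $group: {
        _id: { source: "$source", country: "$country" },
        totalClicks: { $sum: "$clicks" }
    } }
]);
```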
[20:29:28] <tystr> pre-aggregation is the way to go for sums and the like
[20:30:48] <javvierHund> TecnoBrat: what numbers have you got?
[20:31:01] <TecnoBrat> tystr: as in aggregating down into smaller pre-aggregated collections?
[20:31:26] <TecnoBrat> javvierHund: as in number of documents, etc? or number of fields?
[20:31:53] <javvierHund> i mean whats the performance
[20:33:23] <TecnoBrat> well our dataset is 110 million documents, each document has 9 keys, and up to 41 incrementing fields
[20:33:57] <javvierHund> and whats the performance?
[20:34:37] <TecnoBrat> we have tried some different DBs, like columnar DBs, and we can do aggregation on the entire data set and group / sum the columns in 100ish seconds
[20:34:43] <TecnoBrat> I'm about to run some tests on mongo
[20:34:47] <TecnoBrat> but my gut says, its not that fast
[20:40:18] <javvierHund> its impossible to run say a control chart on preaggregated data ;)
[20:40:43] <tystr> we're doing ours almost exactly the way that's described here: http://docs.mongodb.org/manual/use-cases/pre-aggregated-reports/#pre-allocate-documents
[20:40:44] <TecnoBrat> for a simple use case, say we were tracking 4 keys of: source, subsource, destination, and country
[20:41:04] <TecnoBrat> we would then pre-aggregate down to source, subsource, destination, and drop country
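A sketch of that pre-aggregation step: roll each event up into a summary collection keyed by the three remaining fields (country dropped) and bump counters via upsert. All names here are hypothetical:

```javascript
// One upsert per (source, subsource, destination) bucket.
db.stats_by_dest.update(
    { source: "adnet", subsource: "campaign7", destination: "landing-3" },
    { $inc: { hits: 1 } },
    { upsert: true }  // create the bucket document the first time this key combo is seen
);
```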
[20:45:34] <javvierHund> that kind of means you are trolling
[20:45:47] <TecnoBrat> well tystr our app handles 30,000 requests a second, and mongo handles the updates of the stats just fine (we aggregate in the app as well before we do upserts, so thats not 30,000/s to mongo)
[20:46:08] <TecnoBrat> our aggregation technique has worked well for us so far
[20:46:09] <mw44118> hi -- i'm trying to save a mongo instance that has a CPU that's overwhelmed
[20:46:31] <TecnoBrat> and with the new aggregation changes, I think it will solve some short term issues
[20:46:51] <mw44118> I ran db.currentOp(), and I see a lot of queries that I would like to terminate, because at this point, there's no reason to still run them (they are from webservers). How to do that?
[20:46:54] <TecnoBrat> (the preaggregation technique was actually taking too long to update the sub-collections we use)
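For mw44118's question: db.currentOp() reports an opid for each in-progress operation, and db.killOp() asks the server to stop one. A sketch that kills long-running queries against a hypothetical "mydb.events" namespace:

```javascript
// Kill queries that have been running for more than 30 seconds on one namespace.
db.currentOp().inprog.forEach(function (op) {
    if (op.op === "query" && op.ns === "mydb.events" && op.secs_running > 30) {
        db.killOp(op.opid);
    }
});
```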