#mongodb logs for Thursday the 26th of July, 2012

[00:12:24] <dstorrs> got a weird one here. if we do NOT index our jobs collection, we end up correctly processing all 27,000 jobs. if we DO index it, we end up correctly processing between 9,000 and 13,000
[00:12:39] <dstorrs> any idea what might cause this?
[00:14:30] <dstorrs> it's a pretty big issue -- runtime is 5 minutes with the index and 30 without, so it's a huge impact on our other systems.
[00:26:49] <dstorrs> <cricket, cricket>
[01:07:39] <krz> any help would be appreciated: http://stackoverflow.com/questions/11660931/how-do-i-group-by-id-and-country
[02:54:53] <sent-hil> how fast is mongodb's find when using time based selectors compared to regular find with string?
[02:56:16] <sent-hil> i've been working with relatively small amount of data (<100k records) and I don't see much of a diff.
[03:51:55] <vsmatck> Time in bson is int64.
[03:53:46] <sent-hil> vsmatck: is that a problem?
[04:21:00] <at133> Hi, I have a map reduce query that all of a sudden has stopped running. It is complaining about str.replace not being a function. If I trim the code down to only include the replace it will run, but if I try to do a parseInt() on it, it will complain about replace. This was running for days with no problems. I've stopped and restarted mongodb, it's 1.8.3. Any suggestions?
[04:21:58] <nemothekid> Your datatypes could be messed up?
[04:22:31] <nemothekid> paste bin your map/reduce?
[04:32:39] <at133> nemothekid: http://pastebin.com/QVPkGdKh
[04:34:48] <nemothekid> in your reduce
[04:34:49] <nemothekid> var tmp_view = parseInt(values[0].view_count.replace(/\,/g, ''));
[04:34:59] <nemothekid> values[0].view_count is an integer
[04:35:03] <nemothekid> or number
[04:35:09] <nemothekid> rather
[04:36:12] <nemothekid> in your emit, remove the commas if you have to, then only deal with numbers in your reduce
[04:36:14] <at133> Thats what I thought, but I can run the replace on it without the parseInt, that works. Adding parseInt creates the replace not a function error.
[04:37:00] <nemothekid> my guess view_count must be a string then
[04:37:10] <nemothekid> is it a string in your database?
[04:37:34] <nemothekid> replace might be returning undef as well
[04:41:01] <at133> Yes, view_count is a string; in particular, values[0].view_count is a string, because I've been echoing it out to confirm. It's "98,359". I can strip the comma out of it and print it as "98359", but parseInt of that causes mongo to return the replace is not a function nofile_b:0 error.
[04:41:42] <at133> I can't even do parseInt of a string that I define, like parseInt("123").
[04:42:32] <nemothekid> I would either 1.) convert it to a number. 2.) use parseInt in your emit
[04:46:28] <at133> I'll rework it then. Do you have any idea why it would stop working though?
[04:50:04] <nemothekid> were you actually getting correct results?
[04:54:12] <at133> Yes, for days it was running fine
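A minimal mongo shell sketch of the fix nemothekid suggests: strip the commas and parseInt in the map so the reduce only ever sees numbers. Only view_count comes from the conversation; the collection and emit key names here are hypothetical.

    // map: normalize the string "98,359" to the number 98359 before emitting
    var map = function () {
        var views = parseInt(this.view_count.replace(/,/g, ''), 10);
        emit(this.page, { view_count: views });
    };

    // reduce: only ever deals with numbers now
    var reduce = function (key, values) {
        var total = 0;
        values.forEach(function (v) { total += v.view_count; });
        return { view_count: total };
    };

    db.pages.mapReduce(map, reduce, { out: { inline: 1 } });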
[04:58:08] <jamiel> Hi all, is it possible to upsert but only update my "inserted_at" field if it's a new record?
[05:00:33] <krz> anyone can help me with this: http://stackoverflow.com/questions/11660931/how-do-i-group-by-id-and-country
[05:25:42] <jamiel> nm, I'm going to ditch that column altogether and rely on the id from timestamp
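For what jamiel falls back on: a default ObjectId already embeds its creation time, so the insert time can be read straight off _id. A small shell sketch (collection name hypothetical):

    var doc = db.things.findOne();
    doc._id.getTimestamp();   // ISODate of when the ObjectId (and usually the document) was created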
[06:02:44] <nofxx> How would you model a "big queue"? Say, like, 10k users, need to do some job on each, and keep that result for history.
[06:04:02] <nofxx> In order to make it scale nicely with parallel workers*
[06:56:18] <krz> any ideas: http://stackoverflow.com/questions/11660931/how-do-i-group-by-id-and-country
[07:26:09] <vsmatck> krz: Stop spamming.
[07:39:27] <[AD]Turbo> hola
[07:55:52] <krz> vsmatck: why is that spamming?
[07:56:10] <krz> I'm not doing it every few minutes am i
[08:00:12] <Shadda> The big thing that sticks out to me is that mongostat is reporting that it's consuming nearly 5gb of ram
[08:00:50] <Shadda> and not letting go, even when idle
[08:01:55] <deoxxa> Shadda: what's this?
[08:02:03] <Shadda> hm?
[08:02:08] <deoxxa> ram usage?
[08:02:19] <Shadda> ah
[08:02:19] <Shadda> sec
[08:02:26] <Shadda> 1:00 AM
[08:02:27] <Shadda> The big thing that sticks out to me is that mongostat is reporting that it's consuming nearly 5gb of ram
[08:02:38] <Shadda> bleh don't think my client sent that
[08:02:51] <deoxxa> oh
[08:03:10] <deoxxa> what reading is that?
[08:03:15] <deoxxa> vsize?
[08:03:17] <Shadda> k i've no idea if those are sending but I won't spam the channel any longer with it
[08:03:51] <Shadda> I'm looking at mongostat, and after a clean reboot, vsize starts off at 700m
[08:04:13] <deoxxa> vsize is just the virtual memory size - the size of all the data mongo has mmap()'d and such
[08:04:16] <Shadda> but after about 10 seconds of my code doing any work it jolts up to 5gb
[08:04:21] <deoxxa> that's fine
[08:04:30] <Shadda> the real issue though is the cpu usage
[08:04:42] <deoxxa> that's mongo mapping files into its address space - it doesn't actually use that much ram at one time
[08:05:07] <Shadda> it wasn't doing this until today, but looking at `top` mongo shows 768% cpu
[08:05:38] <Shadda> http://cl.ly/image/281H2H19200E
[08:05:47] <deoxxa> eep
[08:05:49] <Shadda> the blue is my code, the pink is mongo
[08:06:00] <deoxxa> that definitely sounds fishy
[08:06:12] <Shadda> and I'm not hammering it much harder than I have been for the last 2 months
[08:06:13] <Shadda> I dunno
[08:06:24] <deoxxa> could it be that you've recently crossed a threshold where it does a lot more swapping?
[08:06:29] <deoxxa> do you have a continually growing dataset?
[08:06:37] <Shadda> no, the data set is more or less static
[08:06:41] <Shadda> changes once a month if at all
[08:06:43] <deoxxa> that's really odd then...
[08:07:15] <Shadda> is there some way I can figure out if I'm just missing an index somewhere?
[08:07:33] <deoxxa> yeah, take one of your queries and run .explain() on it
[08:07:46] <deoxxa> i.e. db.some_collection.find({herp: "derp"}).explain()
[08:08:03] <Shadda> mk
[08:28:30] <ankakusu> hey guys!
[09:21:43] <fredix> hi
[09:22:54] <fredix> I have a problem with 2 threads connected to mongodb: when the first thread writes data, the second cannot read it
[09:23:26] <fredix> anyway I have set this options on mongodb connection : BSON("getlasterror" << 1 << "j" << true)
[09:38:33] <NodeX> cannot read while writing or cannot read at all?
[09:44:08] <algernon> fredix: are you usiing two threads with a single connection, or two distinct connections?
[09:44:14] <algernon> fredix: the first won't work.
[09:44:34] <fredix> two distinct
[09:45:33] <NodeX> [10:38:37] <NodeX> cannot read while writing or cannot read at all?
[09:45:34] <fredix> algernon: I create two instances of nosql: https://github.com/nodecast/ncs/blob/master/nosql.cpp#L69
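Roughly the shell equivalent of the options fredix passes from the C++ driver: a write followed by a getLastError that waits for the journal commit (collection name hypothetical).

    db.stuff.insert({ payload: "data" });
    db.runCommand({ getlasterror: 1, j: true });   // block until the write has hit the journal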
[09:50:58] <solars> hey, is there any way I can speed up this count query: https://gist.github.com/7dd239e41ec262257fdd - I'm not sure if the index is used also for count, is there a way to explain a count query?
[09:57:41] <krz> any ideas: http://stackoverflow.com/questions/11660931/how-do-i-group-by-id-and-country
[09:59:50] <remonvv> Is anyone aware of a method to force a mongos process to refresh cluster metadata from config servers without restarting mongos?
[10:00:06] <NodeX> solars : count() is expensive on queried data
[10:00:56] <solars> hm yeah, I just found https://jira.mongodb.org/browse/SERVER-1752
[10:01:00] <solars> :(
[10:02:58] <Rozza> krz: answer coming up
[10:04:06] <krz> Rozza: thanks
[10:05:16] <solars> NodeX, the queries are used for pagination, so its an unnecessary slowdown.. I guess I will just cache the count then
[10:07:28] <NodeX> that tends to be the accepted solution at the moment solars
[10:08:07] <solars> alright, thanks a lot!
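A minimal sketch of the counter-caching NodeX and solars settle on: keep the total in a separate document, bump it on every write, and have pagination read the cached value instead of running count(). Collection and counter names are hypothetical.

    // on insert of an event
    db.counters.update({ _id: "events_total" }, { $inc: { n: 1 } }, true);   // upsert

    // on remove of an event
    db.counters.update({ _id: "events_total" }, { $inc: { n: -1 } });

    // pagination reads the cached total
    var total = db.counters.findOne({ _id: "events_total" }).n;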
[10:15:31] <Rozza> krz: answered
[10:15:51] <krz> Rozza: looking
[10:16:38] <krz> Rozza: testing it out, will let ya know in a bit
[10:16:50] <Rozza> cool
[10:17:03] <Rozza> I've not tried a computed group before :)
[10:19:11] <krz> Rozza: nah still don't work
[10:19:25] <Rozza> ok go with plan 2
[10:21:19] <Rozza> krz let me go check this first
[10:21:59] <krz> ok
[10:28:00] <Rozza> ok krz - I'm missing something; the computed group worked for me in my test
[10:28:05] <Rozza> what do you need?
[10:28:36] <krz> Rozza: i just need it to group by id and countries. and a country count
[10:28:56] <krz> disregard everything else. ill do the sorting later
[10:28:59] <Rozza> get a country count per id + country?
[10:29:33] <krz> yea. so for example: "_id" => "20120726/foobar/song/custom-cred",
[10:29:34] <Rozza> sorry visits count per id+country
[10:29:40] <krz> will have UK => 2
[10:29:43] <krz> US => 1
[10:29:46] <Rozza> ah lol
[10:29:57] <krz> "_id" => "20120725/foobar/song/test-pg3-long-title-here-test-lorem-ipsum-dolor-lo", will have
[10:29:57] <Rozza> ok we nearly have that
[10:30:00] <krz> UK => 1
[10:35:59] <Rozza> ok - dynamically projecting a field isn't going to work
[10:36:07] <Rozza> but you could push to a list
[10:36:25] <Rozza> eg: counts: [{uk: 1}, {france: 2}…]
[10:36:42] <krz> ok we can try that. how do we go bout that?
[10:44:18] <krz> Rozza: any idea?
[10:44:27] <Rozza> on it
[10:52:24] <Habitual> I am in need of some guidance for your package. The "mongo guy" (our client) is having one or more of "network latency, network limits, disk io speed, io wait" issues. I have been beating up sysstat tools all day. I have added counts for mysqld and mongod processes in zabbix. I really need to identify the cause. Thank you.
[10:53:06] <NodeX> what sort of guidance?
[10:54:06] <Habitual> Well, mostly on where the issue lies; he believes his setup "may" not be optimally configured. We're (I'm) looking for a CPU + IO spike of ~7 seconds.
[10:55:04] <Habitual> sar runs every 10m and the zabbix queries count.{mongod,mysqld} run every minute.
[10:56:14] <Habitual> some stats about his env. might help you help me?
[10:56:39] <krz> Rozza: hows it going?
[10:56:48] <NodeX> what's zabbix ?
[10:57:10] <Habitual> Fault Tolerance Reporting and Monitoring.
[10:57:27] <Habitual> a la Cacti/Nagios/...
[10:57:41] <Habitual> add Icinga = zabbix.
[10:57:51] <NodeX> what is he trying to do that's thrashing ?
[10:58:34] <Habitual> dunno exactly, it's a rather general complaint. He relies on the visuals in htop for this complaint.
[10:59:07] <NodeX> I would ask him to use explain() and see what is taking so long to make the CPU spike
[11:00:21] <NodeX> it could be any number of things, a bad / lack of index perhaps, the way he is querying maybe
[11:00:28] <kali> Habitual: you're aware of the existence of mongostat to look for internal stats ?
[11:00:28] <remonvv> Anyone know why this is thrown? "shard version not ok in Client::Context: client in sharded mode, but doesn't have version set for this collectio...", code: 13388, ok: 0.0 }, ok: 0.0, errmsg: "should be sharded now"
[11:00:32] <Habitual> He won't or hasn't run or offered any speculation about what I see in htop -s PERCENT_CPU -u mysql or htop -s PERCENT_CPU -u mongodb and I have to ask why would mongostat be disabled? For a mongo guy, that seems counter-intuitive.
[11:00:59] <Habitual> kali: yes, thanks. He's referred to that tool.
[11:01:33] <NodeX> what does mysql have to do with this ?
[11:02:12] <Habitual> a 49G db in my experience usually is in need of help. He's using MySQL.
[11:02:46] <NodeX> how big is the mongo db?
[11:02:51] <Habitual> but this is my first exposure to mongodb, so I may be talking out of my @ss.
[11:02:54] <NodeX> how much of it is in RAM ?
[11:03:08] <Habitual> NodeX: how to tell size?
[11:03:27] <NodeX> http://www.mongodb.org/display/DOCS/Monitoring+and+Diagnostics
[11:04:01] <Habitual> mongostat says VSIZE is 7.4g
[11:04:17] <NodeX> how much ram does the system have
[11:05:34] <Habitual> free -tom : Total: 70018 61101 8916
[11:07:15] <Habitual> now that's new/odd.
[11:08:11] <Habitual> typed mongod and it barfed ... Thu Jul 26 04:07:00 [initandlisten] closeAllFiles() finished...
[11:08:54] <NodeX> already running?
[11:09:06] <kali> Habitual: mongod is the daemon, it's mongo you want to run in command line
[11:09:08] <Rozza> krz: can't be done
[11:09:28] <Habitual> thanks.
[11:10:02] <Habitual> db.serverStatus().mem
[11:10:02] <Habitual> {
[11:10:02] <Habitual> "bits" : 64,
[11:10:03] <Habitual> "resident" : 1261,
[11:10:03] <Habitual> "virtual" : 7594,
[11:10:03] <Habitual> "supported" : true,
[11:10:05] <Habitual> "mapped" : 3312, - help?
[11:10:11] <NodeX> pastebin
[11:10:24] <Habitual> he did turn off journaling. PB Noted!
[11:10:32] <Habitual> sorry. :)
[11:10:36] <krz> Rozza: damn….
[11:10:49] <krz> Rozza: but isn't the aggregation framework meant for that?
[11:11:19] <kali> Habitual: have you guys checked out for non-optimum queries in the logs ?
[11:11:20] <Rozza> sure - but the issue here is dynamically creating fieldnames
[11:11:27] <Rozza> which you can't do
[11:11:39] <Rozza> not based on values from aggregations
[11:11:42] <kali> Habitual: also, you want to look for line talking about file preallocation in the logs
[11:11:59] <Habitual> okies.
[11:12:03] <krz> but if i can do it with visits should i be able to do it with countries?
[11:12:09] <Rozza> I can do the counting fine - it does in the example; it's the collapsing into a single doc that's not working
[11:12:44] <krz> how are you doing the counting?
[11:12:47] <krz> can i see the code
[11:13:08] <Rozza> I posted the answer on SO
[11:13:45] <Rozza> as you are grouping by country and id
[11:14:21] <krz> is there an alternative approach?
[11:14:30] <krz> changed my doc structure?
[11:22:18] <krz> Rozza: your answer does not group by country though
[11:22:31] <Rozza> it groups by id and country as asked
[11:23:04] <Rozza> :_id => { '$add' => ['$_id', '$visits.country_name']},
[11:26:15] <krz> ah i see what you mean. is it not possible to have the countries listed in the ids?
[11:26:26] <krz> right now I'm seeing duplicated ids in the results
[11:26:31] <krz> result*
[11:27:50] <Habitual> kali or NodeX can you peek at http://pastie.org/4335888 - log digging output. agent.log has 128 Traceback entries. Thanks.
[11:27:51] <krz> Rozza: nigel suggests this: "I'd try grouping by both _id and country first (letting you do the count you want), then group the result just by _id to give the structure you want."
[11:28:29] <krz> does he mean throw in a second group in the pipeline?
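A mongo shell sketch of the two-stage pipeline nigel describes: group by (id, country) to count, then group by id again, pushing the per-country counts into a list. The visits.country_name field follows what Rozza quotes; the collection name is hypothetical.

    db.pages.aggregate([
        { $unwind: "$visits" },
        // stage 1: one document per (page id, country) with its visit count
        { $group: {
            _id: { page: "$_id", country: "$visits.country_name" },
            count: { $sum: 1 }
        }},
        // stage 2: collapse back to one document per page id, pushing the per-country counts
        { $group: {
            _id: "$_id.page",
            countries: { $push: { country: "$_id.country", count: "$count" } }
        }}
    ]);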
[11:31:15] <Rozza> krz ok have a solution
[11:31:20] <Rozza> update incoming
[11:32:45] <krz> Rozza: ok lets change things a bit first
[11:32:59] <krz> i realize i don't need to group by id. just by country
[11:33:12] <Rozza> really?
[11:33:27] <Rozza> ha well I'll answer the original question as its better
[11:33:55] <Rozza> actually - ok update and I'll review after lunch
[11:34:15] <kali> Habitual: mms is the mongo monitoring agent. it pushes monitoring data to a server belonging to 10gen somewhere in the cloud. I'm not using it, so I can't say if this is expected behaviour
[11:34:37] <Habitual> well, he has no complaints about that service. :)
[11:35:07] <Habitual> kali: are you suggesting the push to MMS service may be the issue?
[11:35:19] <kali> as for the DFM::findAll stuff, it's nothing.
[11:35:19] <Habitual> push or pull.
[11:35:24] <Habitual> k
[11:37:40] <krz> Rozza: your answer at SO is right for grouping id and country. ill select it. but now i need to only group by country
[11:38:16] <kali> Habitual: file allocation lines look like that: 2012-07-02_05:20:17.74193 Mon Jul 2 05:20:17 [FileAllocator] allocating new datafile /xxx/yyy/zzzz, filling with zeroes...
[11:38:34] <Habitual> okies. :)
[11:39:09] <Habitual> none of those that I saw. but the mnt/tmp/logs/mongodb/mongodb.log file is zero bytes, so I zgrep'd :)
[11:39:25] <Habitual> looking for the obvious. :(
[11:39:49] <kali> Habitual: then grep for "nscanned:" to look for slow queries
[11:40:59] <Habitual> kali: it lit up like Xmas.
[11:41:16] <kali> Habitual: which one ? :)
[11:41:44] <Habitual> zgrep nscanned /mnt/tmp/logs/mongodb/*
[11:42:19] <kali> well, you get one line for each slow query (slow being 100ms+)
[11:42:27] <Habitual> FileAllocator hit too.
[11:42:52] <Habitual> you want a pastebin sample?
[11:42:57] <kali> why not
[11:44:47] <Habitual> refresh http://pastie.org/4335888 please. :)
[11:46:05] <Habitual> nscanned hits there too now.
[11:49:46] <kali> Habitual: for file allocation, you actually want the "other" lines ("done allocating datafile"): they say how much time it took
[11:49:57] <Habitual> k
[11:50:01] <Habitual> stand by.
[11:50:32] <kali> Habitual: as for the queries, the nscan is not that high, so these queries are not lacking an index
[11:50:45] <Habitual> that a good thing? :)
[11:50:53] <Habitual> sounds about right.
[11:51:47] <kali> Habitual: it's a good thing as in "your developer knows what he's doing" and a bad thing as in "we haven't found what we're looking for" :)
[11:52:12] <Habitual> my phone isn't ringing kind of Good.
[11:53:05] <kali> exactly
[11:53:40] <Habitual> kali: refresh. :)
[11:54:18] <kali> same kind of good.
[11:55:43] <kali> Habitual: i think the next thing to do is run a db.currentOp() in the shell during one of these spikes
[11:55:57] <Habitual> and look for ...?
[11:56:17] <kali> mmm db.eval, map reduce, queries with $where
[11:57:29] <Habitual> okies.
[11:57:35] <kali> or anything fishy
[11:58:13] <Habitual> is there a mechanism or CLI switch that can dump out the result of db.currentOp() ?
[11:58:58] <kali> Habitual: mongo --eval 'printjson(db.currentOp())'
[11:59:12] <Habitual> rockin'!
[11:59:44] <Habitual> { "inprog" : [ ] } is all I see atm.
[12:00:08] <Habitual> but I will instruct the client. Thank you all very much for your asistance.
[12:02:05] <Habitual> assistance too. I Kan Sphell. ;)
[12:03:06] <NeoNmaNDK> I need to set up "replication" on my Mongo server and put a sharded dataset on it; is there anyone who can send me guides for this? I use CentOS 6.2 today.
[12:03:30] <kali> Habitual: u're welcome
[12:34:18] <Rozza> krz: back
[12:34:30] <Rozza> so did my answer answer the question asked?
[12:34:49] <Rozza> I'm not sure on the new question
[12:34:55] <Rozza> whats the url?
[12:43:09] <DinMamma> Hiya.
[12:43:20] <Rozza> hi DinMamma
[12:43:43] <DinMamma> I have a mongo instance that is reported as down in the replica-set. I have an SSH session alive (if I try to open a new one it times out).
[12:44:03] <DinMamma> The load on the machine is around 35 and I did a service mongodb stop but its not terminating..
[12:44:10] <DinMamma> Is it safe to do a restart on the machine?
[12:44:34] <Rozza> DinMamma: do you have journalling on?
[12:44:41] <DinMamma> To clarify "sudo service mongodb stop -> cursor is not returning".. Its been locked up for around 5 minutes.
[12:44:45] <DinMamma> Rozza: yes.
[12:45:42] <Rozza> ok -and there is a new primary node now?
[12:45:56] <DinMamma> There sure is.
[12:46:54] <Rozza> from a mongodb point of view a reboot should be fine. If its within the window of the oplog the node should auto recover
[12:47:22] <DinMamma> Ok, cool. Thanks
[12:52:10] <DinMamma> Rozza: A restart of the machine brought mongo back as primary. Groovy!
[12:54:13] <DinMamma> Before the machine died i did this query "db.database.collection({}, fields=({'product_hash':1, '_id':0})" In the collection there is ~80 million documents and there is a index on product_hash.
[12:54:20] <adamcom> when I see questions like DinMamma's I am so happy that journaling is default on in 2.0+ :)
[12:55:12] <DinMamma> Saved my day, thats for sure.
[12:55:26] <adamcom> Rozza, if you are scanning every record (no criteria on product_hash) then it's still a table scan
[12:56:01] <adamcom> oops, sorry - meant to direct that at DinMamma
[12:56:02] <adamcom> :)
[12:56:08] <Rozza> tell DinMamma not me :P
[12:56:38] <adamcom> :)
[12:56:39] <DinMamma> Haha, that's funny. "Din Mamma" is "your mother" in Swedish.
[12:56:59] <DinMamma> Anywho, thanks for clearing that up for me! :)
[12:57:04] <adamcom> so……there are DinMamma jokes in Sweden?
[12:58:45] <DinMamma> Why, yes of course!
[13:00:16] <Derick> DinMamma: tullefant :P
[13:01:26] <adamcom> DinMamma: :) to expand a bit on my previous statement, if you use an index for a query that has to touch all (or nearly all) the documents in a collection, it can actually be slower than not using an index - the index has to be loaded into memory first, then the document, so it's more inefficient - of course, if everything is in memory already then the difference will be minor - if things have to be loaded from disk then the index scan + table scan is
[13:07:09] <DinMamma> adamcom: you cut off there but if I get you correctly it would be better to just db.col.find({}) rather than db.col.find({}, fields=....)?
[13:17:28] <adamcom> sorry, distracted - no, as I read your query (it's not shell format, and I don't recognize the driver language), the fields are just a projection, i.e. that's what you want returned
[13:18:19] <adamcom> my point was, that, in the case where you are going to have to touch all the records in a collection, using an index will be less efficient
[13:18:38] <adamcom> I'll repost my original in two parts, so it doesn't get cut off.....
[13:18:51] <adamcom> to expand a bit on my previous statement, if you use an index for a query that has to touch all (or nearly all) the documents in a collection, it can actually be slower than not using an index - the index has to be loaded into memory first, then the document
[13:19:07] <adamcom> hence it's more inefficient - of course, if everything is in memory already then the difference will be minor - if things have to be loaded from disk then the index scan + table scan is going to be slower
[13:19:34] <adamcom> DinMamma: all a general statement, not specific to your query
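One way to see adamcom's point on a concrete query is to compare the two plans with explain(); hint({$natural: 1}) forces the plain table scan. The collection name here is hypothetical, product_hash is from DinMamma's query.

    // plan the optimizer picks for the projection-only scan
    db.items.find({}, { product_hash: 1, _id: 0 }).explain();

    // force a collection scan and compare nscanned / millis between the two plans
    db.items.find({}, { product_hash: 1, _id: 0 }).hint({ $natural: 1 }).explain();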
[13:19:47] <DinMamma> adamcom: It's Python. From what I understood this will only look in memory, so effectively that would mean indexOnly: True?
[13:29:39] <adamcom> indexOnly: True is indicated in an explain when your entire query can be returned from the index (without touching the document itself) - the full index, in this case the one on product_hash will:
[13:29:47] <adamcom> 1. have to be in memory already
[13:29:54] <adamcom> 2. used for the query
[13:30:29] <adamcom> I'd have to check if the index would be used for an empty criteria field, my gut says yes, but I'd have to check
[13:31:10] <adamcom> or, if you use explain() you can see for yourself - if IndexOnly : true is there and it's using the Btree cursor, then you are good
[13:32:24] <adamcom> note: the index doesn't have to all be in memory for it to be an indexOnly : true (i.e. covered index) query, but that's what you want for it to be nice and fast
[13:32:58] <adamcom> get the host into MMS and watch page faults and IO (if you install munin) to get a feel for it over time
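A shell sketch of the covered-query check adamcom describes, assuming the index on product_hash from earlier (collection name hypothetical):

    db.items.ensureIndex({ product_hash: 1 });

    // query and projection restricted to the indexed field, _id excluded, so it can be covered
    var plan = db.items.find({ product_hash: "abc123" },
                             { product_hash: 1, _id: 0 }).explain();

    plan.cursor;     // e.g. "BtreeCursor product_hash_1" when the btree index is used
    plan.indexOnly;  // true when the documents themselves were never touched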
[13:35:19] <joeritchey> I am having a strange issue when I try to add a new member to a replica set
[13:36:46] <adamcom> joeritchey: paste message to pastie (or similar) and I'll take a look
[13:37:00] <adamcom> assuming its long - one/two lines is OK here
[13:37:38] <joeritchey> The sync seems to finish but new server is showing less than 1/2 of the size of the original database
[13:38:06] <joeritchey> in a rs.status() the new server went from RECOVERING to SECONDARY
[13:38:17] <joeritchey> but the oplog is showing 24hrs behind
[13:38:36] <joeritchey> No error messages are showing up
[13:38:57] <hashpuppy> was hoping scott was in here. do you guys know the state of morphia? i'm still on 1.6.3 and am considering upgrading. will morphia work with 2.0?
[13:47:17] <hashpuppy> there he is. skot, i just asked this moments before you joined: hashpuppy: was hoping scott was in here. do you guys know the state of morphia? i'm still on 1.6.3 and am considering upgrading. will morphia work with 2.0?
[13:57:26] <ankakusu> hi all!
[13:57:48] <adamcom> joeritchey: is the host in MMS? (looking to see what the repl ops counters are like)
[13:59:19] <adamcom> and the smaller data size is normal - when you do an initial sync of a secondary, it writes the data files from scratch on the new secondary - all padding, empty space is removed
[13:59:57] <joeritchey> The new server is not in MMS. I will add it now
[14:00:04] <adamcom> hashpuppy: AFAIK morphia will work just fine with 2.0 but I don't know if there were any breaking changes off the top of my head
[14:00:21] <hashpuppy> adamcom: thanks
[14:00:22] <adamcom> as you say, skot knows all there :)
[14:01:21] <joeritchey> I understand the differences in padding. The original database is 17GB and on the new secondary it is showing 3.9GB
[14:01:35] <joeritchey> I thought 14GB of padding was a bit much
[14:42:40] <ankakusu> Derick, why didn't you convert the timestamp value into mongodb?
[14:48:21] <clone1018> I'm having quite a weird issue with MongoDB and its PHP driver. Basically I've enabled auth=true in the mongodb config file, and verified my script can log in using the user information I selected, but you're still able to make a connection and read data without using a user on localhost
[14:49:50] <algernon> you also need to tell in the server config to do auth.
[14:50:08] <clone1018> algernon: in the mongodb config file?
[14:50:38] <algernon> auth = true
[14:50:43] <clone1018> Ya, that's set.
[14:50:55] <NodeX> and restarted ?
[14:50:59] <clone1018> Yes.
[14:51:10] <clone1018> And I've tried starting it manually and using --auth
[14:52:45] <NodeX> are you using a socket?
[14:52:51] <NodeX> (unix socket)
[14:53:16] <clone1018> No
[14:53:22] <clone1018> Well, maybe
[14:53:40] <clone1018> Depends on if $m = new Mongo(); does a socket request or not
[14:54:49] <NodeX> not unless you specify it
[14:55:04] <NodeX> are you using persistent connections?
[14:55:09] <clone1018> No then. But I need sockets to be authed too.
[14:55:14] <clone1018> I haven't specified that no.
[14:56:13] <NodeX> are you authing on a per database level?
[14:56:29] <clone1018> Pretty much
[14:56:46] <clone1018> Would it be a problem if I had a database without any users?
[14:58:00] <NodeX> if i recall correctly you must have one user in your user collection
[14:58:08] <NodeX> at least one user*
[14:58:51] <NodeX> "To enable secure mode, run the mongod process with the --auth option (or --keyFile for replica sets and sharding). You must either (1) have added a user to the admin db before starting the server with -auth, or (2) add the first user from a localhost connection (you cannot add the first user from a connection that is not local with respect to the mongod process)."
[14:59:09] <NodeX> part (2)
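A minimal sketch of option (2) from that quote, using the 2.0-era shell helpers; user names and passwords are placeholders.

    // from a mongo shell connected over localhost to the mongod started with --auth
    use admin
    db.addUser("siteAdmin", "superSecretAdminPass");

    // then per-database users for the application
    use mydb
    db.addUser("appUser", "appSecret");
    db.auth("appUser", "appSecret");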
[14:59:41] <clone1018> There's a user in the admin database
[15:00:30] <NodeX> which Mongo version ?
[15:01:07] <clone1018> 2.0.6
[15:06:04] <benpro> Hi there, to secure a pool of replicaset, is there any other way than a VPN ?
[15:10:14] <clone1018> NodeX: any idea?
[15:27:24] <sspy> I use MongoVUE to manage my DB and I see 2 columns - Size(40.8 MB) and Storage(51.3 MB). Which one handles my data and what is the purpose of another ?
[15:29:47] <sspy> any thoughts ?
[15:30:26] <linsys> sspy: one is the actual size on disk (storage) and the other is the actual size of your database
[15:31:30] <clone1018> I'm having quite a weird issue with MongoDB and its PHP driver. Basically I've enabled auth=true in the mongodb config file, and verified my script can log in using the user information I selected, but you're still able to make a connection and read data without using a user on localhost
[15:31:52] <linsys> clone1018: did you start mongodb with --auth
[15:31:57] <clone1018> I tried that yes
[15:32:10] <linsys> what data can you read?
[15:32:21] <clone1018> Everything
[15:32:30] <clone1018> Probably can write it too
[15:32:37] <linsys> then you're doing something wrong
[15:32:40] <sspy> it's good, I thought they sum
[15:33:14] <clone1018> linsys: I realize that.
[15:34:20] <clone1018> linsys: any idea what it could be?
[15:50:29] <sspy> comparing to mysql how much more space is used by mongo ?
[15:52:16] <balboah> a lot
[15:53:11] <sspy> even without taking into account preallocation ?
[15:55:30] <balboah> yup. Maybe times 4-5, depends on how your setup looks like. Hard to translate 1 to 1
[15:56:55] <balboah> sspy: http://www.wikivs.com/wiki/MySQL_vs_MongoDB says between 2-10. Looks just about right as I have a version of the same data in both mongo and mysql
[15:59:34] <sspy> bad, space is very critical to me
[16:06:19] <balboah> then mongo is not for you :)
[16:06:57] <balboah> space is cheap
[16:07:57] <sspy> my vps provider doesn't think so :)
[16:09:19] <TkTech> sspy: VPS + MongoDB = </3
[16:40:28] <pdtpatrick> Question - i'm trying to make a node in a replica set arbiter only, but keep getting the following: http://pastie.org/private/chjsgfd5jdwaz1cjt9hpw
[16:46:43] <pdtpatrick> ahh nvm - i fixed it
[16:46:51] <joeritchey> You could try removing the node and then add it back awith rs.addArb
[17:22:27] <remonvv> exit
[17:22:31] <remonvv> oops
[17:29:19] <pdtpatrick> can u bind mongo on multiple ports? for instance, localhost and also eth0? right now i have it on eth0 and i can only connect on that nic and cannot connect via socket to localhost
[17:30:44] <jY> i would just use a bind_ip of 127.0.0.1
[17:30:52] <jY> and connect on that and not localhost
[17:31:16] <pdtpatrick> right but how can i keep eth0 active cuz i need to bind to it from another host
[17:31:26] <jY> what?
[17:32:59] <pdtpatrick> nvm - i've been up wayy too long. Thanks
[18:00:04] <Rhaven> Hi all, i'm currently stuck with some errors related to mongodb using replica sets and sharding.. Is there someone to help me?
[18:01:59] <joeritchey> I can try what errors are you getting
[18:02:19] <Rhaven> Hi joeritchey
[18:02:55] <joeritchey> Hi
[18:03:20] <joeritchey> I have been working through a few issue myself so I have been reading up the last few days
[18:04:13] <Rhaven> i'm actually running with 3 config servers, and 3 mongos, and 1 mongos is stuck with "scoped connection to [xxxx] not being returned to the pool"
[18:04:31] <Rhaven> 2 others mongos are fine
[18:05:09] <Rhaven> and there is no connection problem between config server and mongos
[18:05:45] <Rhaven> i can access all config server & mongos from anywhere
[18:05:56] <joeritchey> are all of your servers syncd to the same time
[18:06:01] <Rhaven> yeah
[18:06:51] <Rhaven> i run ntpdate to check and it is the same
[18:08:40] <Rhaven> that's weird
[18:08:44] <Rhaven> :/
[18:09:28] <joeritchey> are you seeing the error in the mongos log or is the driver returning a error?
[18:09:39] <Rhaven> mongos log
[18:12:31] <Rhaven> but around 1 in 30 times the server pings all the config servers successfully, and 10s later it reports the same error again
[18:12:58] <joeritchey> so the server having the problem can ping the hostname or ip address of each server as it appears in config servers
[18:13:53] <joeritchey> if you just run ping for a few minutes to the config servers does it respond with 100% success
[18:14:37] <Rhaven> let me see
[18:19:04] <Rhaven> I ping all servers successfully without packet loss
[18:19:12] <Rhaven> *config servers
[18:24:38] <joeritchey> Hmmmm that is weird
[18:24:54] <Rhaven> yes :/
[18:29:17] <Rhaven> thank you anyway :)
[18:34:10] <joeritchey> are all of the servers on the same network?
[18:57:44] <pdtpatrick> Question - i've installed Edda and ran "edda /var/log/mongodb/mongodb.log" -- however the page that comes up on port 28000. Show just the frame and no data
[18:58:35] <pdtpatrick> http://screencast.com/t/IrvCWkIs
[18:58:38] <pdtpatrick> that's what i get
[19:15:56] <JoeyJoeJo> How can I delete an entire collection?
[19:22:20] <idefine> JoeyJoeJo: i believe you drop it
[19:23:33] <idefine> JoeyJoeJo: http://www.mongodb.org/display/DOCS/Overview+-+The+MongoDB+Interactive+Shell#Overview-TheMongoDBInteractiveShell-Deleting
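A quick shell illustration of the options behind idefine's link; drop() is the usual answer to JoeyJoeJo's question (collection name hypothetical).

    db.things.drop();       // removes the collection itself, including its indexes
    db.things.remove({});   // removes every document but keeps the collection and its indexes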
[19:40:22] <verrochio> hello everyone
[19:40:44] <verrochio> i installed mongodb on debian lenny from 10gen deb source
[19:41:01] <verrochio> when i try to save a document within mongo shell i get _filename.empty()
[19:41:15] <verrochio> error
[19:45:23] <verrochio> anyone know anything about "
[19:45:24] <verrochio> _filename.empty()
[19:45:24] <verrochio> " error
[19:46:38] <verrochio> anyone?
[19:58:11] <verrochio> HEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEELP
[20:08:50] <tphummel> hello folks. i'm trying to shard the resulting documents of a Map/Reduce job using "out: {sharded: true}". using both 2.0.x and 2.2 i get an error at the end of the job. curious if anyone else has seen this error: err { [MongoError: MR post processing failed: { errmsg: "exception: unknown out specifier [sharded]", code: 13522, ok: 0.0 }]
[20:08:51] <tphummel> name: 'MongoError',
[20:08:51] <tphummel> ok: 0,
[20:08:51] <tphummel> errmsg: 'MR post processing failed: { errmsg: "exception: unknown out specifier [sharded]", code: 13522, ok: 0.0 }' }
[20:11:05] <madrigal> verrochio: did you try increasing open files limit?
[20:11:13] <ProLoser|Work> i'm on mac, how can i figure out what the path is to mongod?
[20:11:24] <ProLoser|Work> i'm trying to run a script that says it can't find the command
[20:12:31] <tphummel> it appears that it doesn't like the key "sharded" in the out object. i'm wondering if is that necessary anymore
[20:12:42] <verrochio> i just restarted
[20:12:47] <verrochio> the server now it works
[20:12:53] <verrochio> thanks anyway
[20:29:41] <nofxxxx> Wow, 2.2.rc0... finally; mongodb was the only piece of my stack w/o remote syslog
[20:45:24] <krispyjala> hey guys, trying to do a group by, and it works with this command: db.discourses.group({key: {name:true}, cond: {evaluation:"OK", vlifecycle: {$ne:"retire"}}, reduce: function(obj,prev) {prev.csum++; }, initial:{csum:0}});
[20:45:36] <krispyjala> but now how do I tell it to show me only if csum is > 1?
[21:12:20] <dstorrs> anyone here use the Perl language driver? how do I rename a collection using it?
[21:19:30] <tiripamwe> is it possible to match against regular expressions from the python client?
[21:20:34] <nemothekid> dstorrs: $db->run_command({renameCollection => "db.collection", to => "db.newcollection"});
[21:21:44] <dstorrs> no joy. "No such command "to"
[21:23:05] <nemothekid> I guess you have to use a IxHash?
[21:23:50] <dstorrs> aha. say Dumper [ $db->run_command(Tie::IxHash->new(renameCollection => "dstorrs.foo", to => "dstorrs.bar" )) ];
[21:24:06] <dstorrs> heh, ninja'd. thanks nemothekid
[21:25:15] <krispyjala> hey nemothekid do you know how to filter group by results?
[21:25:33] <dstorrs> tiripamwe: in the Perl client it would be like so: $db->coll->find({ field => { '$regex' => /foobarbz/ } })
[21:25:45] <dstorrs> python probably allows something similar
[21:25:54] <krispyjala> i ran this and it was fine db.discourses.group({key: {name:true}, cond: {evaluation:"OK", vlifecycle: {$ne:"retire"}}, reduce: function(obj,prev) {prev.csum++; }, initial:{csum:0}});
[21:26:03] <krispyjala> but now I want to filter only results with csum > 1
[21:26:21] <tiripamwe> nevermind... you can, thanks
[21:26:41] <dstorrs> tiripamwe: what did you do? add it as an extra clause in the cond?
[21:27:54] <tiripamwe> dstorrs: thanks a lot, it looks like this in python db.coll.find( { field : re.compile('foobarbz') } )
[21:29:05] <dstorrs> huh. So python broke with the mongo spec, interesting.
[21:29:09] <dstorrs> cool, glad you found it.
[21:32:29] <nemothekid> dstorrs: both work in both languages actually. The former is more powerful because you can do things like find({field : {$not : re.compile('foobar')}}); otherwise you couldn't do a not on a regex
[21:32:39] <nemothekid> krispyjala: I think group returns an array so you can just filter it yourself
[21:32:59] <nemothekid> otherwise you could always add a finalize function, like if (csum == 0) return null
[21:33:15] <krispyjala> nemothekid: within mongo cmd line? oh hmm interesting let me try thx
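Both of nemothekid's suggestions, sketched against krispyjala's group() call; group() returns a plain array in the shell, so the csum > 1 filter can simply be done client-side.

    var rows = db.discourses.group({
        key: { name: true },
        cond: { evaluation: "OK", vlifecycle: { $ne: "retire" } },
        reduce: function (obj, prev) { prev.csum++; },
        initial: { csum: 0 }
    });

    // keep only the groups that appear more than once
    var repeated = [];
    rows.forEach(function (r) { if (r.csum > 1) repeated.push(r); });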
[21:33:28] <dstorrs> nemothekid: well, unless you use a negated regex ("anything but" character class, negative lookahead, etc)
[21:33:37] <dstorrs> but, yes.
[21:33:44] <nemothekid> mo regex mo problems
[21:38:59] <dstorrs> yes, a lot of people who aren't comfortable with regexen think that.
[21:39:00] <dstorrs> ;>
[22:07:57] <tphummel> ahh. figured out my problem above, in which i was getting: err { [MongoError: MR post processing failed: { errmsg: "exception: unknown out specifier [sharded]", code: 13522, ok: 0.0 }]
[22:08:54] <tphummel> not so good with c++ but looks like the order is significant in pulling properties off the "out" object
[22:08:58] <tphummel> https://github.com/mongodb/mongo/blob/master/src/mongo/db/commands/mr.cpp#L243
[22:09:20] <tphummel> the order i was assigning values to out object was breaking
[22:10:05] <tphummel> out: { sharded: 1, reduce: "name"} failed while out: {reduce: "name", sharded: 1} works
[22:51:59] <dufflebunk> I have a simple set of data, but I need to perform a least not-less-than-X query on it. So I might search for "5" and the document it would return would have a key of "5" if it exists, or "6" if "5" doesn't exist, ...
[22:52:07] <dufflebunk> Is mongodb able to do this?
[23:19:22] <vsmatck> dufflebunk: You'd need a index on the integer with the right sort order. I haven't thought it through fully but I think you can do it.
[23:21:30] <dufflebunk> I was thinking of indexing the field, then find({field:{$gte:<some value>}}).sort().limit(1)
[23:22:20] <dufflebunk> but I wasn't sure if the db is smart enough to use the index for the sorting, and also smart enough not to sort the entire set of results when I only want the first result.
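A sketch of dufflebunk's plan spelled out; with an ascending index on the field, the $gte match, the sort, and the limit can all be served from the index, so the full result set never needs sorting. Collection and field names are hypothetical.

    db.data.ensureIndex({ key: 1 });

    // smallest key not less than 5: returns the doc with key 5 if it exists, otherwise 6, 7, ...
    db.data.find({ key: { $gte: 5 } }).sort({ key: 1 }).limit(1);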
[23:22:58] <krz> given this structure. how can i include the country_name to the result set? https://gist.github.com/3161542
[23:23:09] <krz> with this query: https://gist.github.com/
[23:23:19] <krz> oops https://gist.github.com/3185209
[23:35:35] <krz> anyone?
[23:40:09] <dstorrs> krz: checking...
[23:41:48] <dstorrs> first of all, I recommend against this data structure.
[23:42:15] <dstorrs> embedded collections for something potentially unbounded like visits are a bad idea. They are hard to work with, and they risk blowing your 16MB document limit
[23:42:26] <dstorrs> particularly when the internal docs are relatively large, as these are.
[23:43:09] <dstorrs> what exactly are you trying to achieve here? that query is too complex to understand easily
[23:48:31] <krz> dstorrs: trying to return something like this https://gist.github.com/3185279
[23:49:20] <krz> ideally it would be good to return the country name
[23:50:02] <krz> dstorrs: i.e. something like: https://gist.github.com/3185283
[23:50:21] <krz> line 4
[23:54:23] <krz> dstorrs: I'm using hierarchical documents because I'm following http://www.10gen.com/presentations/webinar/real-time-analytics-with-mongodb
[23:54:27] <krz> for real-time stats
[23:54:32] <krz> follows a similar structure
[23:55:10] <dstorrs> I'm not familiar enough with the pipeline / aggregation stuff to help, sorry.