pmxbot IRC Log Viewer

[06:09:39] <kennethd> I am trying to track down two pymongo OperationFailure errors being generated by cron scripts: "cursor id ... not valid at server" (pymongo 2.4.2, python 2.7.3, mongodb-linux-x86_64-2.2.3) + "getMore: cursorid not found" in mongodb.log, but all references I can find online point to a timeout issue after 10 mins & a) i see this error < minute into the program (even running script from command line) & b) find(timeout=False) within script d

[06:26:59] <crudson> kennethd: I use ruby rather than python, but the timeout will only work in ruby if invoked with a block rather than returning a cursor. Sorry I don't know the analog to that in python, but perhaps it could be something similar.

[06:37:54] <jsheely> Howdy

[06:38:48] <jsheely> Anyone around that may be able to answer a few questions? Specifically on running files

[08:33:10] <kennethd> crudson: hmm, don't know what that means, but i will sleep on it... thanks for the reply though!

[08:34:05] <crudson> kennethd: I looked a little, and it seems like that won't apply...perhaps the cursor is being closed elsewhere or falling out of scope.

[08:34:10] <crudson> not a python guy, sorry :(

[08:36:00] <kennethd> crudson: np, thanks for the thoughts

[09:32:37] <aandy> hi guys, i have a master-slave-slave setup, and wondering if i can config it to permently allow queries on secondary members? as in, on a db level, not cursor/lang

[09:34:25] <Nodex> the querying comes from the DSL

[09:37:43] <aandy> sorry, i'm not sure i follow. dsl in mongo?

[09:37:54] <Nodex> the connection string

[09:38:48] <aandy> okay, so i should always make connection instances slaveOk, it can't be done at a config level?

[09:44:28] <aandy> it'll make for an interesting loadbalancing exercise. would be trivial if it just had to select one good, but it has to be the PRIMARY, hehe. i'll put my thinking cap on. thanks, Nodex

[09:57:42] <derelm> hi, can i use replication to migrate a mongodb instance from one server to the other with almost no downtime? if so, how would i do that?

[10:03:15] <marcqualie> derelm: you would add the new server as a secondary then wait for it to catch up. Once it's caught up you would issue the stepdown command to the old server and this will make the new server primary. Once it's primary it will be safe to remove the old server

[10:03:31] <marcqualie> there will be about 10-20 seconds of downtime while the stepdown takes place but most drivers will handle this gracefully and retry

[10:03:50] <derelm> will i have to prepare config on my former server somehow to make it work?

[10:04:27] <marcqualie> no there shouldn't be any config, unless you have authentication. If you haven't already you will want to turn it into a replica set using rs.initiate()

[10:04:46] <marcqualie> http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/

[10:05:02] <marcqualie> that should help you out with the precedure if it isn't already enabled for a replica set

[10:17:39] <Siyfion> say I have a collection containing documents in the format: { name: "foo", items: [ { value: 1.99 }, { value: 2.99 }, { value: 1.99 } ] } and I want to remove *all* the items-objects that have a value of "1.99" across all the documents in the collection, is there a way I can do this?

[10:19:51] <Siyfion> I kinda think that I need to use the update() syntax.. but I'm not 100% sure how to get it to remove the object from the array...

[10:23:00] <Nodex> update and pull

[10:23:03] <Nodex> $pull

[10:23:40] <Nodex> db.foo.update({},{$pull:{items:{value:1.99}}},{multi:true});

[10:26:54] <Siyfion> Awesome, cheers Nodex ;)

[10:27:36] <Siyfion> Hadn't got the right amount of nesting.. :\

[10:28:37] <Nodex> no probs

[10:29:12] <Siyfion> Nodex: Ah, it worked all bar the multi bit..

[10:29:28] <Siyfion> Nodex: It's worked on the first document, but not subsequent

[10:32:45] <Siyfion> and it seems to remove all occurances from the first document too, so that bit's working

[10:36:22] <Nodex> which version of mongo ?

[10:37:48] <Siyfion> db.foo.update({},{$pull:{items:{value:1.99}}},false,true);

[10:37:53] <Siyfion> That worked however. ;)

[10:38:16] <Nodex> you must have an old mongo version

[10:38:27] <Siyfion> I was testing using the mongodb shell

[10:38:31] <Siyfion> on their website

[10:38:37] <Siyfion> quite possibly!

[11:35:52] <avril14th> Hello, is it possible to query in mongodb for objects that have the latest timestamp?

[11:56:11] <Nodex> avril14th : just add a sort of _id : -1 to your query

[11:56:22] <Nodex> (assuming you use object ID's as _id)

[11:56:50] <avril14th> well, i need the latest timestamp for each unique criteria actually

[11:57:09] <Nodex> perhaps ask that as a question then :)

[11:57:15] <avril14th> like Object.where(:field.in => @array).only_latest_for_each_field_value

[11:57:22] <avril14th> yes sorry :)

[11:59:00] <avril14th> any clue for such a constraint?

[12:00:40] <Nodex> do you want unique documents ?

[12:00:53] <Nodex> I don't fully understand what you're after

[12:01:02] <avril14th> okay sorry

[12:01:16] <avril14th> my collection has two fields a timestamp an a non-unique id

[12:01:53] <avril14th> so I need to retrieve all the records within an array of ids, but for each id, i need the only record with the latest timestamp

[12:02:09] <avril14th> any clearer?

[12:13:32] <Nodex> you will have to map/reduce that

[12:14:05] <Nodex> or use distinct() with sort

[12:18:20] <avril14th> Nodex: ok will try out, thank you

[12:34:11] <unxmaal> hi all

[12:34:34] <unxmaal> what's the correct role for an admin-level user to be able to issue "rs.conf()" in an authenticated mongod instance?

[13:48:59] <aandy> hi. i'm getting: Assertion: 13111:wrong type for field (uid) 1 != 16, but how do i *force* a int32 assignment rather than double? i've tried: { $set: { uid: 1 } }, { $set: { uid: parseInt(1) } }, but they return double aswel o_O

[13:49:35] <aandy> the update is ok, as verified by setting it to, e.g. String, was no problem

[13:52:40] <aandy> bah, open bug since 1.8 :/

[14:02:23] <Nodex> wrong type?

[14:02:27] <Nodex> Mongo doesn't force types

[14:04:49] <aandy> well, then it defaults to double for ints, in shell at least

[14:05:33] <aandy> NumberInt(123) works, but it still maps parseInt to double

[14:05:46] <Nodex> that's javascript engine not mongo

[14:05:57] <Nodex> and on the shell you don't need to use "parseInt"

[14:06:11] <Nodex> unless you're passing in a string

[14:08:38] <aandy> ah, right, js engine, my bad. still had to explicity cast it though

[14:09:05] <aandy> { $set: { uid: 123 } } still remained as double, which i don't understand

[14:13:12] <Nodex> it shouldn't, what mongo / js version are you using?

[14:14:53] <Nodex> scratch that

[14:16:26] <Nodex> according to 10gen, all numbers are Doubles in MongoDB

[14:16:49] <Nodex> sorry, in Javascript shell

[14:17:57] <aandy> ah, okay. thanks

[14:18:10] <aandy> not a big deal, i was just curious :)

[14:18:51] <redsand> any reason why mongos wont accept new connections and closes the socket?

[14:19:00] <redsand> there are only 900 open connections from that pid

[14:19:05] <redsand> Centox 6 x86_64

[14:19:08] <redsand> 2.4.3

[14:19:28] <redsand> admin.$cmd query: { whatsmyuri: 1 } at src/mongo/shell/mongo.js:L114 exception: connect failed

[14:19:51] <redsand> we need to restart mongos to resolve the issue, right now. not a great situation

[14:22:55] <Nodex> that's a lot of connections

[14:26:30] <remonvv> \o

[14:37:21] <aandy> is there a specific reason why this: https://jira.mongodb.org/browse/SERVER-1594 isn't supported (yet), or is it a simple case of "haven't gotten around to it"? is it an uncommon scenario in mongo (as in, am i doing it wrong ;)) my case is: 3 boxes, 3 mongod (replica, no sharding), idea being: each box needs access to the same data. to provide failover in case either: 1. local db goes down, 2. primary goes down (3. remote slave/box goes down). election, fa

[14:38:02] <aandy> so what i do now is update the a loadbalancer whenever the primary changes (which is hackish)

[14:42:36] <remonvv> aandy, you can still do it, you just need to create a single shard.

[14:42:42] <remonvv> There's very little practical difference.

[14:43:50] <starfly> aandy: agree it would be good to have that feature, though

[14:44:04] <aandy> remonvv: great, i'll look into that, thanks

[14:44:26] <aandy> starfly: right, but not much activity on it yet, but at least it's planned and assigned :)

[14:45:32] <remonvv> "desired"

[14:45:39] <remonvv> ;)

[14:45:46] <aandy> better than "unwanted, closed" :p

[14:46:01] <remonvv> Only marginally so in my experience :p

[14:47:08] <Nodex> I am very surprised mongos didn't already do this

[14:47:31] <starfly> Nodex: agree, seems like a no-brainer

[14:47:52] <aandy> supposedly the logic needs to be ported (off of a comment in the jira history)

[15:21:23] <redsand> any ideas on debugging the reason why this value is returned: MONGO_BSON_NOT_FINISHED

[15:21:31] <redsand> bson_finish is definitely being called

[15:21:39] <redsand> is there some sort of character i should be escaping?

[16:01:14] <serafie> Hello! Will MongoDB run on PowerPC under LTIB in an embedded environment? Is it flexible enough to deal efficiently with very little resources on a very tiny dataset?

[16:01:42] <serafie> I'm looking to replace a very slow implementation of a k-v datastore which uses sqlite in the backend.

[16:03:59] <kali> serafie: well, it's definitely not what mongodb is optimised for

[16:04:28] <kali> serafie: mongodb likes big servers with buckets of RAM, and the feature set is much wider than k-v

[16:05:40] <kali> irk ! LTIB site uses comics sans serif

[16:06:44] <serafie> kali: I must not have comic sans installed. :D

[16:07:25] <kali> serafie: more seriously, first thing mongodb does when starting is allocate a few hundred MB on the disk, and mmap them. even with an empty dataset

[16:07:34] <kali> serafie: so.. embedding ? i'm not too sure :)

[16:08:15] <serafie> kali: yeaah ok. So know of any others that are extremely lightweight?

[16:12:00] <kali> well, i can google for you, but... sorry, emebedded is not really my field

[16:12:23] <serafie> no prob, I know it's OT. I'm searching. Thanks for your help!

[16:49:01] <jblack> sarafie: perhaps sqlite is a better fit for embedded?

[19:45:25] <astropriate> hello

[19:45:46] <astropriate> is it possible to pass values to the reduce function of map-reduce? i want to only get documents that meet a certain criteria. I am also referencing to to other documents using the _id field. is there some way to populate/join

[19:45:59] <astropriate> if there is a better way please advise

[20:08:03] <astropriate> anyone her?e

[20:39:57] <ptwobrussell> as a matter of curiosity, could someone point me to any docs that talk about the kinds of indexes/data structures that mongodb uses under the hood? I found this ticket https://jira.mongodb.org/browse/SERVER-380 but it's not obvious what data structure ends up getting plugged in from reading the comments? I know that b-trees are used for most storage, but what about the full text index? is it clucene? And what about the geo-indexes? Are they based o

[20:39:57] <ptwobrussell> R-trees?

[21:28:21] <leifw> ptwobrussell: as far as I can tell, everything is b-trees

[21:28:44] <leifw> ptwobrussell: the geo indexes are just done by massaging queries into something a b-tree can deal with

[21:30:34] <ptwobrussell> leifw - thanks, that's helpful. are you saying that you think even the full text index is b-tree based as well?

[21:31:16] <leifw> ptwobrussell: that is true, the full text index is built by splitting the text into tokens, removing stop words, and stemming, and then dumping the results (as big arrays of tokens) into a b-tree

[21:32:08] <leifw> so inserting {'text': "the quick brown fox jumped"} is a bit like inserting {'text': ['quick', 'brown', 'fox', 'jump']} into a normal index on {'text':1}

[21:32:34] <ptwobrussell> leifw - thanks, that's very helpful (and interesting)

[21:33:02] <leifw> the stemming might go the other way, I don't know, you might end up with ['quick', 'quickly', 'brown', 'browned', 'browning', 'fox', 'foxes', 'jump', 'jumped', 'jumping', 'jumps'] instead

[21:33:50] <ptwobrussell> leifw - probably the former would make the most sense and the query is stemmed to match what would be stored

[21:34:01] <leifw> ptwobrussell: I'd hope so

[21:34:02] <ptwobrussell> i wouldn't have guessed it would work that way. i'm a little bit intrigued

[21:34:09] <ptwobrussell> me too :)

[21:34:18] <leifw> ptwobrussell: I'm hoping to get FTS into TokuMX very soon, I think the insertion advantage would be killer there

[21:34:50] <leifw> basically piggyback on the array indexing advantage we have, but for something with more use cases

[21:35:03] <ptwobrussell> wow. that's really cool.

[21:40:23] <heewa> I'm trying to decide between raid 0 and raid 10 across EBS volumes on amazon. Mongo's website on production notes says this on the topic: "RAID-0 provides good write performance but provides limited availability, and reduced performance on read operations, particularly using Amazon’s EBS volumes: as a result, avoid RAID-0 with MongoDB deployments."

[21:40:40] <heewa> I'm trying to understand why raid 10 would have better read performance than raid 0. Any idea?

[21:41:49] <SproutDB> hey everyone, I am doing some research on database-as-a-service, would you please take my survey, giving away a $50 amazon gift card surveymonkey.com/s/C57JP8W

[21:41:56] <diegows> hi

[21:42:01] <heewa> Are they saying that raid 10 across 4 volumes has better read performance than raid 0 across --2-- volumes (basically assuming the mirror isn't there)? Or that it has better read performance than even raid 0 across 4 volumes?

[21:42:53] <leifw> if you have a good controller I imagine it would be better than RAID 0 across 4 volumes

[21:43:17] <leifw> with straight RAID 0 each request can be served by exactly one disk, so if you get unlucky you can get all your threads hitting the same disk

[21:44:26] <leifw> with RAID 10 each request can be served by two disks, so let's say all the even stripes are on the first two disks, now you can have two requests for even strips (spread out from each other) and each can be served by a separate disk head

[21:44:45] <leifw> RAID 1 has the best read performance, because every disk can serve every requst, but that slows down writes

[21:44:54] <leifw> it's kind of like sharding (RAID 0) vs. replication (RAID 1)

[21:45:26] <heewa> Great answer, thank you!

[21:45:46] <leifw> np

[22:09:20] <lacrymology> how do I query for x.foo != 0? I'm trying find({$not: { foo: 0}}) but it's not working

[22:11:01] <lacrymology> how do I query for x.foo != 0? I'm trying find({$ne: { foo: 0}}) but it's not working

[22:13:37] <lacrymology> got it

[22:54:21] <dougb> if I empty a collection and new documents are created, is there a possibility that the document could get the same ID as a previously created document? I saw that part of the ObjectID is based off of the unix epoch so I would think it's always unique

[23:03:19] <crudson> dougb: it's very unlikely, on the same machine, if the clock hasn't changed

[23:03:29] <crudson> read: not going to happen

[23:03:50] <dougb> cool, thank you!

[23:05:02] <crudson> note that clients create the ids http://docs.mongodb.org/manual/reference/object-id/ contains some helpful info

Log file Viewer

Help | Karma | Search:

#mongodb logs for Monday the 3rd of June, 2013