#mongodb logs for Tuesday the 4th of June, 2013

[01:13:55] <leotr> hi! http://pastie.org/8002969... can't understand why ip field is not stored
[01:15:36] <leotr> what am i doing wrong?
[01:17:37] <leotr> i had to use $set... mongo is so mongo...
[01:17:40] <leotr> thank you
[03:14:15] <xilo> /names/names
[03:25:53] <xilo> looking to see what type of database to use. i have this scenario: several objects of type Foo with predetermined values for properties. Object class Bar could have a reference to Foo.id=1 and object class Blah could also have a reference to Foo.id=1. can you set this up without being wonky, i.e. different collections so i don't duplicate data? or should i stick to an RDBMS
[06:12:20] <laner> is it safe to run mms agent as root?
[07:18:37] <lessless> hi guys! how do I enable database listing in ubuntu? should I edit the startup script and add --rest, or is there a config option for this?
[07:25:42] <Nodex> eh?
[08:23:14] <aandy> pretty sure it listens, either on /tmp/ socket or port 27017, by default
[08:23:43] <aandy> rest as well. at least in some package managers (debian for instance)
[10:09:34] <hallas> How do I query for documents where an id is in an array in that document?
[10:10:19] <Nodex> eh?
[10:10:48] <hallas> A document has an array of strings
[10:10:56] <hallas> I want to find those documents where a particular string is in that array
[10:11:07] <Nodex> db.foo.find({bar:"baz"});
[10:13:26] <hallas> doesnt work for me :/
[10:13:39] <Nodex> please pastebin a typical document
[10:41:17] <Nodex> or not
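
Nodex's one-liner is in fact the whole answer: an equality match against an array field succeeds if any element equals the value, so no special operator is needed. A minimal shell sketch, using a hypothetical tags field:

    // a document whose "tags" field is an array of strings
    db.foo.insert({ name: "doc1", tags: ["red", "blue"] });
    // equality against an array field matches if any element equals the value
    db.foo.find({ tags: "red" });  // matches doc1
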
[10:59:19] <yorickpeterse> Hey folks, is there an easy way to get an overview of all the used fields in a Mongo collection? Preferably taking nested structures into account. For example:
[10:59:33] <yorickpeterse> Given you have something like this: [{title: "foo", author: {name: "john"}}, {title:"bar", author: {age: 10}}]
[10:59:57] <yorickpeterse> You'd expect something like this as the result: { title: 2, author: 2, author.name: 1, author.age:1 }
[11:15:10] <crudson> yorickpeterse: in ruby: http://pastie.org/8004413 - bit of a hack but it's 4am
[11:15:50] <yorickpeterse> Hm, that would be an option. However, this will be done for a pretty big (30m objects or so) collection
[11:16:14] <yorickpeterse> You can probably do it using the aggregation framework/map reduce but I fear that will be something I can't look at 2 weeks after writing it
[11:16:36] <crudson> not for 30 million, do a map reduce
[11:16:38] <yorickpeterse> But I'll keep the Ruby in mind just in case :)
[11:16:52] <yorickpeterse> Hm, map/reduce performs better than the aggregation framework in that case?
[11:17:01] <yorickpeterse> In most of our cases we've had the exact opposite in terms of performance
[11:17:02] <crudson> aggregation is in-memory
[11:17:08] <yorickpeterse> hmm
[11:17:40] <yorickpeterse> oh brb, lunch
[11:18:27] <crudson> bed here, if you wanted an example map reduce for this send me a pm, would be a nice little one to do
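
A rough map/reduce sketch of the field census crudson offers to write, walking nested (non-array) sub-documents and counting each field path; the collection name docs and output name field_counts are assumptions, and a real version would special-case types like ObjectId and Date:

    var map = function () {
        function walk(obj, prefix) {
            for (var key in obj) {
                var path = prefix ? prefix + "." + key : key;
                emit(path, 1);  // count one occurrence of this field path
                var val = obj[key];
                // recurse into plain sub-documents, but not arrays
                if (val !== null && typeof val === "object" && !(val instanceof Array)) {
                    walk(val, path);
                }
            }
        }
        walk(this, "");
    };
    var reduce = function (key, values) {
        var sum = 0;
        values.forEach(function (v) { sum += v; });
        return sum;
    };
    db.docs.mapReduce(map, reduce, { out: "field_counts" });
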
[11:32:35] <avril14th> Hello, does mongo support pipelining, like "save 10 objects in one shot"?
[11:33:06] <Snebjorn> yes
[11:33:22] <Snebjorn> can't remember what it's called
[11:33:22] <avril14th> okay. :)
[11:33:37] <avril14th> does that really improve perf?
[11:33:49] <avril14th> mongo seems quite fast at writing anyway
[11:34:35] <Snebjorn> http://docs.mongodb.org/manual/reference/method/db.collection.insert/
[11:34:37] <kali> bulk insert
[11:34:37] <Snebjorn> bulk insert
[11:34:47] <avril14th> thanks!
[11:36:21] <Snebjorn> as for speed gains, that I don't know
[11:37:02] <Snebjorn> but I would assume there are some as it only requires one connection
[11:37:16] <avril14th> okay
[11:37:26] <avril14th> seems like mongoid doesn't support it though :(
[11:37:45] <kali> it mostly saves round trips
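
kali's point in practice: the 2.x shell's insert() accepts an array, which becomes one batched message instead of N round trips. A sketch with a hypothetical items collection:

    // passing an array performs a bulk insert in a single round trip
    db.items.insert([
        { sku: "a1", qty: 5 },
        { sku: "a2", qty: 7 },
        { sku: "a3", qty: 2 }
    ]);
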
[12:13:05] <schlitzer|work> hey all, what is the recommended way to have a single-host shard, a config server & a mongos on the same ubuntu (12.04) host?
[12:13:27] <schlitzer|work> this is for a development environment
[12:20:19] <schlitzer|work> ahh got it, it is an upstart job
[12:20:27] <schlitzer|work> thx anyways :-)
[12:41:46] <foofoobar> Hi. I just installed mongodb and took a look in the data dir. I have no entries in the database and the data folder has a size of 3.3GB??
[12:43:58] <foofoobar> journal/prealloc.1 1GB, journal/prealloc.2 1GB, journal/j._0 1GB
[12:44:02] <foofoobar> ??
[12:49:27] <double_p> take a wild guess that this could be answered in the FAQ
[12:51:45] <foofoobar> double_p: I know what this journaling is and how to deactivate it
[12:51:55] <foofoobar> double_p: but why is mongodb journaling 3GB?
[12:52:02] <foofoobar> This seems a bit .. much
[13:00:34] <double_p> http://docs.mongodb.org/manual/reference/configuration-options/#smallfiles
[13:08:33] <foofoobar> double_p: so I set smallfiles to true and restarted, can I delete the old big journaling files now?
[13:11:53] <double_p> sure
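
Per the option double_p linked: with smallfiles enabled, journal files are capped at 128MB instead of 1GB (data files are smaller too). In the ini-style config of the 2.2/2.4 era that is one line; the path below is the Ubuntu default:

    # /etc/mongodb.conf
    smallfiles = true
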
[14:14:25] <barnes_> when are the .rpms of 2.4.4 available in the repo?
[14:26:13] <ixti> Hi all
[14:26:22] <ixti> What is the downside of nonAtomic map reduce?
[14:27:00] <ixti> what will happen if two nonAtomic map reduces output into the same collection?
[14:31:15] <diegows> iterating over docs and updating a field that is in one index isn't a good idea right?
[14:36:33] <arthurnn> curiosity: does mongodb use a red-black tree for maintaining the order of keys on embedded docs?
[14:37:58] <Snebjorn> it uses a B tree for searching indexes
[14:38:42] <arthurnn> Snebjorn: sure.. but I am curious to know what they use to maintain the order of keys in an embedded doc.
[14:39:56] <Snebjorn> hmm, I'm actually not sure :)
[14:40:24] <Snebjorn> this is what I think happens
[14:40:33] <Snebjorn> it doesn't keep track of the order of embedded docs
[14:40:56] <Snebjorn> as you can't return a sub doc by index alone
[14:41:08] <Snebjorn> it'll return the entire document
[14:42:08] <Snebjorn> so it just stores the location of the main document and doesn't care about the order of sub docs
[14:42:28] <ixti> I think there are two things
[14:42:38] <ixti> e.g.
[14:43:11] <ixti> { _id: ..., title: "Foo", comments: [ { author: "...", content: "..." } ] }
[14:43:22] <ixti> now index:
[14:44:10] <ixti> { "comments.author": 1 }
[14:44:33] <ixti> that index falls into B-Tree as far as I understand
[14:45:07] <ixti> when we talk about something like this:
[14:45:14] <arthurnn> Snebjorn: actually it does keep the order. try something like this: db.entities.update({ "_id": ObjectId("51adf6957c71c1cc08000001") }, {"$set": {"links.0.a": 1370355778} }); db.entities.update({ "_id": ObjectId("51adf6957c71c1cc08000001") }, {"$set": {"links.0.c": 1370355778} }); db.entities.update({ "_id": ObjectId("51adf6957c71c1cc08000001") }, {"$set": {"links.0.b": 1370355778} })
[14:45:41] <ixti> arthurnn: it keeps alphabetical order AFAIK
[14:45:45] <arthurnn> when u do db.entities.find()['links'] , u will see that the first link has ordered keys
[14:46:06] <arthurnn> ixti: thats what I noticed today... wondering WHY and HOW ?
[14:46:30] <arthurnn> and is that a BSON thing or a Mongodb thing?
[14:46:54] <ixti> arthurnn: gotta run, but I saw some explanation in the docs that it "runs" through all nested docs
[14:47:09] <ixti> indexing and searching is a Mongo thing
[14:47:19] <ixti> BSON is just a format of serialization
[14:48:20] <Snebjorn> arthurnn, try reading about indexing in this book http://it-ebooks.info/book/964/
[14:48:27] <Snebjorn> it's really good
[14:48:57] <Snebjorn> I can't remember if it explains about sub doc ordering
[14:48:57] <arthurnn> Snebjorn: i am not sure if this alphabetical order on subdocs is related to indexes at all or not.
[14:49:26] <arthurnn> Snebjorn: in my example I don't have any index on my collections.. and the order of the keys is still there.
[14:49:42] <Snebjorn> hmm
[14:50:04] <Nodex> are you asking how they're stored in the index or how they're stored by mongo in relation to how they appear in the shell?
[14:51:35] <ixti> arthurnn: google mongo docs re structure of big nested docs. not sure what it was called
[14:51:46] <arthurnn> Nodex: my question has nothing to do with indexes... i am just asking 1. Why do they keep the keys ordered on an embedded doc? and HOW? RB tree?
[14:52:18] <Nodex> the keys are not ordered
[14:52:51] <arthurnn> yes they are... try the example I posted above.. u will see.. every time u $set, that new key won't go to the end of the doc...
[14:52:55] <Nodex> each object in an array after save -may- get ordered but embedded arrays are not ordered for obvious reasons
[14:53:43] <arthurnn> yepp.. i am not talking about arrays.. i am talking about an embedded doc (hash)
[14:53:54] <Nodex> please pastebin the document in question
[14:54:25] <Nodex> because your update looks like an array to me .... links :[{b:1234.....}]
[14:57:29] <arthurnn> Nodex: https://gist.github.com/arthurnn/5706565
[14:57:42] <Nodex> the document
[14:57:49] <arthurnn> it is an array... but it does not have to be... i am using $set and not $push
[14:58:23] <Nodex> $set works with arrays
[14:58:24] <arthurnn> Nodex: the doc is totally irrelevant. run those commands u will see in the last find that the keys on links.0 are ordered.
[14:58:26] <Nodex> as does $push
[14:58:35] <Nodex> no, the document is NOT totally irrelevant
[15:02:54] <failshell> hey guys. im trying to restore a backup we did with LVM snapshots of a sharded cluster, on a different set of VMs. obviously, its failing because the machine names changed. can anyone give me a brief crash course on doing that?
[15:03:46] <Nodex> :/
[15:35:18] <jpfarias_> good morning everyone
[15:35:44] <jpfarias_> I have this collection where I added a coordinates field with [lat, lng] format
[15:36:01] <jpfarias_> later I found on the documentation that it should really be [lng, lat]
[15:36:13] <jpfarias_> does that really matter?
[15:36:30] <Nodex> pretty sure the docs DONT say that
[15:36:39] <Nodex> all of my locations are [lat,long]
[15:37:50] <vargadanis> good morning!
[15:38:25] <vargadanis> is mongoDB able to store geo data like that?
[15:38:36] <Nodex> yes
[15:38:36] <aandy> Nodex: oh no, your db has been buggy all this time, and you've just realized ;)
[15:38:37] <vargadanis> Nodex, I was saying hello to jpfarias_ :D
[15:38:57] <Nodex> aandy : for 3 years I have been giving out bad results :/ .... omgz
[15:39:08] <vargadanis> Nodex, is that a built in data type for mongo or ... linky please? :D I'd like to read about this
[15:39:10] <jpfarias_> Nodex: I'm on the pacific coast of the US :)
[15:39:22] <jpfarias_> hold on, I'll find the link
[15:39:25] <aandy> this blows
[15:39:27] <Nodex> vargadanis : it's an array
[15:39:49] <vargadanis> ohh.. that simple O_o alright
[15:39:50] <Nodex> you can also use an object ... location : {latitude:0.1,longitude:0.5}
[15:40:01] <Nodex> then apply a 2d / geo spatial index to it :)
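
A sketch of what Nodex describes, with a hypothetical places collection (ensureIndex was the 2.x-era call):

    // legacy coordinate pair stored as an embedded document
    db.places.insert({ name: "spot", location: { latitude: 34.09, longitude: -118.2 } });
    // apply a 2d index to the location field
    db.places.ensureIndex({ location: "2d" });
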
[15:40:04] <aandy> i've always used named, to be safe (from myself)
[15:40:15] <ixti> will mapReduce with nonAtomic blow up the world upon concurrent mapReduces?
[15:40:15] <jpfarias_> http://docs.mongodb.org/manual/applications/geospatial-indexes/
[15:40:24] <jpfarias_> To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use 2dsphere index.
[15:40:24] <jpfarias_> Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate reference system for GeoJSON uses the WGS84 datum.
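
The longitude-first rule jpfarias_ quotes applies to GeoJSON data under a 2dsphere index (new in 2.4). A minimal sketch, collection and field names hypothetical:

    // GeoJSON point: longitude first, then latitude
    db.places.insert({ loc: { type: "Point", coordinates: [ -118.202126, 34.093418 ] } });
    db.places.ensureIndex({ loc: "2dsphere" });
    // spherical query; $centerSphere takes a radius in radians
    db.places.find({ loc: { $geoWithin: { $centerSphere: [ [ -118.202126, 34.093418 ], 0.015 ] } } });
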
[15:40:24] <Nodex> I used to use named but it takes up too much space for no reason
[15:40:57] <Nodex> jpfarias_ : you should make sure that's NOT a typo
[15:40:58] <jpfarias_> guess it only matters for 2dsphere
[15:41:06] <aandy> Nodex: true. not a factor for my pet project, but worth considering, right. probably an okay cost when prototyping though, to avoid these bugs :)
[15:41:19] <Nodex> I can assure you that lat,long works fine
[15:41:30] <jpfarias_> it's been working fine for me too
[15:41:44] <jpfarias_> I am just worried that it may stop working on future releases if they change something
[15:42:02] <aandy> they wouldn't change that
[15:42:05] <jpfarias_> but seems to be a 2dsphere index requirement only
[15:42:05] <aandy> really shouldn't
[15:42:13] <jpfarias_> I use 2d index
[15:42:43] <jpfarias_> the other question that I have is
[15:43:06] <Nodex> We need some 10gen confirmation on this
[15:43:18] <Nodex> because all my apps just broke if they suddenly changed it
[15:43:23] <jpfarias_> when I do a {$within: {$center: [[lat, lng], radius]}} query it takes a long time the first time I query it
[15:43:30] <jpfarias_> like really long time
[15:43:36] <jpfarias_> ~ 30 secs sometimes
[15:43:41] <Nodex> what number is your radius?
[15:43:47] <jpfarias_> then when I send the query again it returns instantly
[15:43:50] <jpfarias_> 0.015
[15:43:59] <Nodex> how many docs ?
[15:44:09] <jpfarias_> between 100 and 300
[15:44:24] <Nodex> something is not right there
[15:44:32] <jpfarias_> yeah I just don't know what
[15:44:33] <jpfarias_> :)
[15:44:47] <jpfarias_> I do have like 4million docs on the collection tho
[15:44:50] <Nodex> what specs are your machine?
[15:44:53] <jpfarias_> 100 to 300 is the results size
[15:44:57] <vargadanis> I was reading about read performance of MongoDB. In a situation where it is not critical to have the most up to date information, I could direct the reads to specific replica sets... However the same docs page says that it would be smarter to just direct all reads to the primary and use sharding instead. Are the two mutually exclusive?
[15:45:12] <jpfarias_> it's a pretty fucking good machine :)
[15:45:23] <Nodex> vargadanis : sharding is not mutually exclusive
[15:45:29] <jpfarias_> 24GB of ram dual xeon
[15:45:37] <jpfarias_> I can double check
[15:45:37] <Nodex> and a "shard" may NOT have your data on it
[15:45:50] <Nodex> what memory is mongodb taking jpfarias_ ?
[15:46:07] <jpfarias_> how do I check that?
[15:46:11] <jpfarias_> whatever top says?
[15:46:18] <Nodex> yeh
[15:46:20] <Nodex> "res"
[15:46:31] <jpfarias_> 8.3g
[15:46:48] <Nodex> can you tail -f /var/log/mongodb.log and run the query
[15:46:54] <jpfarias_> it is a 16 core machine
[15:46:56] <Nodex> paste the output about the time taken
[15:47:26] <jpfarias_> 2.27 ghz on each core :)
[15:47:38] <jpfarias_> Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
[15:47:47] <jpfarias_> ok
[15:53:38] <jpfarias_> Tue Jun 4 11:48:15 [conn8645] getmore mls_data.mls_data query: { coordinates: { $within: { $center: [ [ 34.093418, -118.202126 ], 0.015 ] } }, $or: [ { listing_info.status: { $in: [ "active", "pending", "on hold", "backup" ] } }, { listing_info.status: "sold", listing_info.sold_date: { $gte: new Date(1338824884659) } } ] } cursorid:3277315563308696778 ntoreturn:0 keyUpdates:0 locks(micros) r:4588433 nreturned:161
[15:53:38] <jpfarias_> reslen:693541 4588ms
[15:53:50] <Nodex> on 27M docs on my machine the same (ish) query takes 2.7 seconds
[15:53:50] <jpfarias_> this took 4.5s
[15:54:16] <Nodex> make sure you have an index
[15:54:23] <jpfarias_> I'm pretty sure it does
[15:54:28] <Nodex> db.foo.getIndexes()
[15:54:29] <jpfarias_> on the coordinates :)
[15:55:02] <jpfarias_> {
[15:55:03] <jpfarias_> "v" : 1,
[15:55:04] <jpfarias_> "key" : {
[15:55:06] <jpfarias_> "coordinates" : "2d"
[15:55:07] <jpfarias_> },
[15:55:08] <jpfarias_> "ns" : "mls_data.mls_data",
[15:55:09] <jpfarias_> "name" : "coordinates_2d"
[15:55:10] <jpfarias_> },
[15:55:15] <jpfarias_> it is one of them
[15:55:15] <Nodex> use a pastebin :)
[15:55:18] <jpfarias_> sorry
[15:56:47] <vargadanis> it is so damn hard to get started with MongoDB! :) I wish I had a guy I could tell what kind of data I want to store and what read/write performance I need, and he'd set it up for me and just say: it's gonna work, go do your programming! :D
[15:56:56] <dougb> is {"Field":1} essentially the same as creating an indexed field in MySQL?
[15:57:53] <Nodex> dougb : sort of
[15:58:42] <vargadanis> dougb, as far as I can tell, you need to call ensureIndex( { "Field": 1});
[15:58:45] <jpfarias_> so the index is there
[15:59:03] <vargadanis> which will create a secondary index on the field (do you use this term at all?) called "Field"
[15:59:06] <jpfarias_> what could be making it slower?
[15:59:16] <Nodex> jpfarias_ : perhaps the bounds of the index
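
Nodex means the index bounds reported by explain(): in the 2.x shell, explain() shows which index was chosen and the indexBounds that were scanned. A sketch against the collection named in the log paste above:

    // same query as the slow one in the log, with explain() appended
    db.mls_data.find({
        coordinates: { $within: { $center: [ [ 34.093418, -118.202126 ], 0.015 ] } }
    }).explain();
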
[15:59:24] <jpfarias_> what you mean?
[15:59:45] <dougb> I'm using MongoHub right now as a client, so I just enter in a query string under 'index' but I assume it's running it through ensureIndex
[15:59:47] <Nodex> I remove the index and my query goes to 4 seconds
[15:59:58] <dougb> I've tested locally and it seems to work
[16:00:01] <Nodex> I add it and the query is then ~15ms
[16:00:09] <jpfarias_> well
[16:00:13] <jpfarias_> that's the thing
[16:00:15] <kali> vargadanis: you'll do it wrong anyway. it's always wrong the first time. so go do your programming, it won't work, we'll help you fix it, and you'll learn something :)
[16:00:19] <jpfarias_> it is only slow the first time
[16:00:25] <jpfarias_> when I requery it goes really fast
[16:00:33] <jpfarias_> for the same parameters that is
[16:00:51] <jpfarias_> when the location change it goes slow again the 1st time
[16:00:55] <Nodex> that's because it's cached
[16:01:01] <jpfarias_> then fast on next query for same location
[16:01:05] <jpfarias_> right
[16:01:14] <jpfarias_> can I make it cache everything? :P
[16:01:18] <Nodex> no
[16:01:29] <jpfarias_> :(
[16:01:31] <vargadanis> kali, haha yeah.. figures :)
[16:02:10] <vargadanis> dougb, I suppose that is an application-specific thing, but I guess so. I mean it would be the Vulcan (logical) way to do it for me
[16:02:10] <Nodex> you have something wrong jpfarias_ : Mine is instant
[16:02:46] <vargadanis> like the soup?
[16:02:49] <jpfarias_> for different locations?
[16:03:32] <Nodex> http://pastebin.com/HVsC0R4b
[16:03:37] <Nodex> location doesnt matter
[16:03:44] <Nodex> it's not being cached
[16:04:23] <Nodex> I can change the lat/long pair and it's still instant is what I mean
[16:04:33] <double_p> so.. i have this offsite replica. lowest priority and all fine. but it _happens_ that it loses connectivity and then re-election blues (quick, but it DROPS connections). even if i dislike it, i would set it to hidden. but that doesn't prevent an election if it's lost.
[16:04:56] <jpfarias_> http://pastebin.com/bnPTByfQ
[16:05:16] <jpfarias_> my indexBounds on coordinates is empty
[16:05:19] <jpfarias_> why is that?
[16:06:46] <jpfarias_> oh you have 1 document only
[16:06:52] <jpfarias_> I got 4448
[16:06:53] <jpfarias_> lol
[16:06:53] <Nodex> eh LOL
[16:07:02] <Nodex> I don't have 1 document
[16:07:16] <jpfarias_> well that query you did seems to return only 1
[16:07:21] <vargadanis> I read that it is a good idea to create shards per core rather than shards per server for better perf. however is there a way to actually limit a mongod instance to a core or even "dedicate" a core to one? Working on a linux system..
[16:07:22] <Nodex> nope, 8
[16:07:36] <jpfarias_> "n" : 1,
[16:07:53] <jpfarias_> isn't that the number of docs returned?
[16:08:41] <jpfarias_> anyway
[16:08:53] <Nodex> http://pastebin.com/pPeRnQnS
[16:08:54] <jpfarias_> why did my indexBounds got empty?
[16:08:55] <Nodex> 543
[16:09:07] <jpfarias_> there you go
[16:09:11] <jpfarias_> a much better sample
[16:09:25] <jpfarias_> damn you get it back in 5ms
[16:09:43] <jpfarias_> I really have no idea why mine is taking so long
[16:10:07] <jpfarias_> now your indexBounds is empty too
[16:10:12] <Nodex> http://pastebin.com/8Td1yy8p
[16:10:20] <Nodex> 30k~ and still 105ms
[16:10:37] <Nodex> 155*
[16:10:50] <jpfarias_> lemme try a bigger radius on mine too
[16:11:27] <jpfarias_> oh boy
[16:11:30] <jpfarias_> did radius == 1
[16:11:39] <jpfarias_> gonna take like a minute I think
[16:11:40] <jpfarias_> lol
[16:11:43] <jpfarias_> maybe more :P
[16:12:25] <jpfarias_> Nodex: what is your machine spec?
[16:12:54] <Nodex> 2x hexcore 64gb ram, ssd
[16:13:00] <jpfarias_> oh cmon
[16:13:19] <jpfarias_> yours is like 4x or 5x faster than mine then
[16:13:20] <jpfarias_> :P
[16:13:44] <Nodex> OK I'll run it on a lesser spec, 1 sec
[16:13:45] <jpfarias_> no 64gb of ram too :P
[16:15:07] <jpfarias_> http://pastebin.com/3aRj3qAR
[16:15:44] <jpfarias_> ~32k docs
[16:15:48] <jpfarias_> 9.7 secs
[16:16:01] <jpfarias_> looked at 56k docs
[16:16:34] <Nodex> http://pastebin.com/47zNV7jx
[16:16:41] <Nodex> geonames database ~2M
[16:17:00] <jpfarias_> on same machine?
[16:17:04] <jpfarias_> try a bigger radius
[16:17:06] <Nodex> nope
[16:17:13] <Nodex> on a machine with 16gb ram and 1x quadcore
[16:17:14] <jpfarias_> see if you can get > 1k docs
[16:17:33] <Nodex> and no ssd
[16:17:38] <jpfarias_> k
[16:17:48] <jpfarias_> that's a good machine to test this :)
[16:17:56] <Nodex> 959 docs = 35ms
[16:18:09] <jpfarias_> ok
[16:18:14] <jpfarias_> now I feel sad
[16:18:32] <jpfarias_> why is mine so slow?
[16:19:11] <jpfarias_> what version of mongo do you have
[16:19:15] <jpfarias_> I'm running 2.2.3
[16:19:21] <jpfarias_> but I can upgrade to latest
[16:19:27] <jpfarias_> if that helps...
[16:19:56] <Nodex> 2.4 here
[16:19:57] <Nodex> bbbs
[16:20:01] <Nodex> needa shower
[16:20:20] <jpfarias_> k
[16:20:22] <jpfarias_> gonna upgrade mine
[16:24:16] <vargadanis> is there some kinda cool GUI-like app for web or desktop I could use to monitor/manage what is going on with my DB?
[16:26:30] <jpfarias_> http://pastebin.com/1SXbFE4r
[16:26:33] <jpfarias_> damn look at that
[16:26:37] <jpfarias_> 48s
[16:26:48] <jpfarias_> after upgrading
[16:26:49] <jpfarias_> lol
[16:28:27] <jpfarias_> when I resend the query it takes 425ms
[16:29:50] <vargadanis> jpfarias_, you mean you resend the same query and once it is 48ms and then 425?
[16:30:03] <jpfarias_> 48 seconds
[16:30:12] <jpfarias_> down to 425 milli seconds
[16:30:34] <jpfarias_> returning 32k docs
[16:30:41] <vargadanis> ohh
[16:30:55] <vargadanis> 425ms is decent, isn't it?
[16:31:41] <jpfarias_> yes
[16:31:45] <jpfarias_> but 48 seconds is not
[16:31:46] <jpfarias_> :)
[16:32:33] <vargadanis> I suppose the docs for the 2nd query were already in memory or the query was cached... otherwise why the huge performance increase?
[16:32:40] <jpfarias_> that's my only problem
[16:32:45] <jpfarias_> the 1st time I query it
[16:32:55] <jpfarias_> that should take < 1sec
[16:33:06] <jpfarias_> but sometimes it goes up to > 30 secs
[16:33:18] <jpfarias_> then 2nd time and on is always fast
[16:36:10] <vargadanis> I've read - though it's a bit different - that to optimize queries it is sometimes a good idea to query the entire data set first and then run the query which "looks" for something.. but it only works for smaller data sets that fit into memory
[16:36:21] <vargadanis> and also this is from not much of a trusted source
[16:36:58] <vargadanis> guess the first query runs slow because of disk IO? O_o dunno
[16:37:30] <vargadanis> in any way, I gotta run now :) c ya nice folks around
[17:25:15] <failshell> im trying to reconfigure a sharded cluster from backups. problem is, the cluster sees the new shards and the old ones. every command i try fails because of that. tried db.runCommand( removeShard), but its failing because the balancer is not running
[17:25:19] <failshell> kind of a catch 22
[17:27:35] <devastor> We just got "Tue Jun 4 17:08:29 [repl writer worker 2] production.messages Assertion failure type == cbindata src/mongo/db/key.cpp 585" with mongo db 2.2.3, any known bugs that might cause that?
[17:27:55] <devastor> It crashed mongo and it doesn't start again
[17:34:41] <devastor> [repl writer worker 2] ERROR: writer worker caught exception: assertion src/mongo/db/key.cpp:585 on { long document data here or part of it }
[17:34:59] <devastor> [repl writer worker 2] Fatal Assertion 16360 dies to this
[17:37:34] <starfly> failshell: Dude, most folks probably don't want to weigh in on that dicey situation. I'd guess the only way you can effectively recover is to build an environment that looks very similar (operating system, hostnames, hardware) to what you backed up previously and recover into that. Commingling old and new shards sounds like a losing proposition.
[17:38:11] <failshell> well, its not old and new
[17:38:20] <failshell> i have a production backup that i need to restore to staging
[17:38:23] <failshell> for our devs
[17:38:30] <failshell> so all the config points to wrong hosts
[17:39:30] <starfly> failshell: so, build the staging environment out like it was production. You can use the same hostnames, just can't obviously do that with DNS, but only local to the staging hosts...
[17:39:57] <failshell> that's fucked up
[17:40:00] <failshell> if that's the only way
[17:41:34] <starfly> failshell: when hostnames are embedded in configurations, what else can be expected?
[17:41:54] <failshell> change hostnames and point to data
[17:42:00] <failshell> like you can do with pretty much anything
[17:42:08] <failshell> oracle,mysql,postgres,couchdb,etc
[17:42:12] <failshell> DB2
[17:42:33] <starfly> failshell: OK, well I've worked 25+ years with Oracle DBA and I wouldn't agree that it's always easy...
[17:43:04] <failshell> we do it all the time here with no issues...
[17:43:23] <starfly> failshell: fair enough, best of luck
[17:55:42] <jpfarias_> Nodex: you back?
[20:36:17] <jpfarias__> niemeyer: ping
[20:36:25] <niemeyer> jpfarias__: Heya
[20:36:29] <jpfarias__> hi there!
[20:36:59] <miskander> I need to find all records based on an attribute in a different collection. For example find all comments where the user role is 'author'
[20:37:16] <jpfarias__> I'm not sure if you are the right person for this, but why would mongo take so long to answer some query, even when the index is there?
[20:37:31] <jpfarias__> http://pastebin.com/1SXbFE4r
[20:37:42] <miskander> I come from Mysql land, so my first reaction is one query with a join, but that doesn't work out in Mongo :)
[20:38:27] <jpfarias__> this is a collection with 4 million records
[20:38:38] <jpfarias__> this query returned 31k documents
[20:38:44] <jpfarias__> and took 48seconds to run
[20:38:49] <jpfarias__> which seems impossible
[20:38:51] <jpfarias__> lol
[20:39:00] <jpfarias__> shouldn't it take < 1 second
[20:39:01] <jpfarias__> ?
[20:40:52] <niemeyer> jpfarias__: Not sure.. it's loaded 30k objects, and hasn't used indexes
[20:41:24] <jpfarias__> so should I use a different query then?
[20:41:35] <jpfarias__> something like a box would be better?
[20:41:53] <jpfarias__> I didn't know $center didn't use the index
[20:42:35] <niemeyer> jpfarias__: I have never used the geo features of MongoDB, so I'm not a good person to walk you through this, but I'm telling you what that query you made is showing you
[20:43:11] <niemeyer> jpfarias__: Do you have a 2d index in the proper fields?
[20:43:17] <jpfarias__> yep
[20:43:27] <jpfarias__> '2d' index on the 'coordinates' field
[20:43:53] <jpfarias__> this is the entry on the indexSizes: "coordinates_2d" : 158401824,
[20:46:02] <kali> jpfarias__: is your dataset in RAM ? loading 31k from disk can take a while...
[20:46:17] <kali> miskander: you have to denormalize: add the user role alongside the user id in the comment collection
[20:47:11] <miskander> I'll have to maintain the role in 2 places is that the best way?
[20:47:34] <kali> miskander: yes
[20:47:44] <miskander> kali: Thanks
[20:49:00] <kali> miskander: you need to think and design your data model to fit the specific needs of your apps, not as a pristine theoretical thing
[20:49:16] <miskander> kali: gotcha
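
A sketch of the denormalization kali suggests, with hypothetical field names:

    var uid = ObjectId();  // stands in for a real user _id
    // each comment carries a copy of the role alongside the user id
    db.comments.insert({ text: "nice post", user_id: uid, user_role: "author" });
    // the original question then needs no join:
    db.comments.find({ user_role: "author" });
    // the trade-off: keep the copies in sync when the role changes
    db.comments.update({ user_id: uid }, { $set: { user_role: "editor" } }, { multi: true });
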
[20:51:55] <niemeyer> jpfarias__: http://paste.ubuntu.com/5733921/
[20:52:00] <niemeyer> jpfarias__: That's what a query using the index looks like
[20:52:31] <jpfarias__> I see the index bounds when I query for a smaller radius
[20:52:43] <jpfarias__> but when the radius gets bigger it comes out empty
[20:53:42] <jpfarias__> http://paste.ubuntu.com/5733926/
[20:54:13] <jpfarias__> also, I see in many places in the documentation to use longitude first
[20:54:18] <jpfarias__> and I am using latitude first
[20:54:21] <jpfarias__> is that a problem?
[20:54:48] <jpfarias__> i.e., my coordinates look like: [lat, lng] instead of [lng, lat]
[20:55:23] <jpfarias__> http://docs.mongodb.org/manual/reference/operator/center/#op._S_center
[20:55:48] <jpfarias__> there's an "Important: If you use longitude and latitude, specify longitude first."
[20:56:02] <jpfarias__> I only saw this long after the system was in production
[20:56:03] <jpfarias__> lol
[20:59:57] <niemeyer> jpfarias__: Again, without knowledge of the implementation, it shouldn't make a difference if the 2d index is indeed in a Euclidean space
[21:00:33] <jpfarias__> yeah that's what I thought too
[21:01:02] <niemeyer> jpfarias__: In other words, if it's flat and bounded on -180-180 on both values
[21:01:03] <jpfarias__> but since you made the geohash algorithm I thought you'd know better :)
[21:01:07] <niemeyer> But..
[21:01:23] <niemeyer> jpfarias__: That's unrelated to geohashes
[21:01:38] <niemeyer> jpfarias__: Geohashes define how the points are encoded, not how the indexing works
[21:02:03] <jpfarias__> well, I wasn't sure if the order would make the geohash bad
[21:02:26] <niemeyer> jpfarias__: It definitely makes it "bad"
[21:02:40] <jpfarias__> oops
[21:02:41] <niemeyer> jpfarias__: In the sense the coordinates will be completely wrong
[21:03:40] <niemeyer> jpfarias__: If everything just works on a flat 2d space, and you swap x/y, it'd still work, though.. but given the comment in the page, I'd ask someone that knows better than I do
[21:04:18] <niemeyer> jpfarias__: Once you start using spherical coordinates, that'll all be completely bogus for sure
[21:05:01] <jpfarias__> I see
[21:08:24] <jpfarias__> but would that be the reason the query is slow?
[21:12:24] <niemeyer> jpfarias__: That query is running over 30k documents, yielding 600 times.. I don't know where you're running this, but it's definitely not surprising that it's misperforming.
[21:12:46] <jpfarias__> this was just a test
[21:12:53] <jpfarias__> usually it returns 100-300 documents
[21:13:19] <jpfarias__> but still sometimes it takes around 30 secs
[21:13:24] <jpfarias__> for those 300 documents
[21:17:06] <tystr> hm having trouble getting the mms agent working
[21:17:22] <jpfarias__> anyway
[21:17:29] <tystr> the agent from this host shows up under the agents tab, but it never seems to submit any data
[21:17:34] <jpfarias__> I will make a second attribute and put it in the right order
[21:17:37] <jpfarias__> lng, lat
[21:17:46] <jpfarias__> and start moving the queries to use that
[21:17:55] <jpfarias__> to see if that fixes the problem
[21:22:56] <tystr> only thin I see in the logs are "Starget agent process" stuff
[21:22:59] <tystr> s/thin/thing/
[21:23:06] <tystr> any ideas?
[21:25:41] <niemeyer> jpfarias__: Have you investigated queries around the latitude boundary (90 or -90)?
[21:26:13] <niemeyer> jpfarias__: The docs say the limits are bounded at 180, but I wouldn't be surprised if part of the logic is actually enforcing the correct boundaries.
[21:26:27] <niemeyer> jpfarias__: Strictly speaking your query has an invalid latitude.
[21:26:32] <jpfarias__> right
[21:26:53] <jpfarias__> that's why I am trying to go to the right notation first
[21:27:07] <jpfarias__> but most of my data is in california
[21:27:20] <niemeyer> jpfarias__: You might also just test the same query against a point without the correct boundaries
[21:27:28] <niemeyer> Erm.. within
[21:27:43] <jpfarias__> I probably have 0 points like that :)
[21:27:51] <jpfarias__> everything is in the box of california
[21:27:57] <jpfarias__> with is < -100 longitude
[21:28:05] <jpfarias__> *which
[21:29:04] <niemeyer> Okay, yeah, so fixing the data would be the first step
[21:32:06] <tystr> ah I'm trying to run 2 agents separately in the same group in mms…this is the problem
[21:36:34] <jpfarias__> niemeyer: 5% done :)
[21:36:50] <niemeyer> jpfarias__: Well done! ;-)
[21:37:13] <jpfarias__> probably gonna take 30 minutes ~ 1h to go thru all of them
[21:37:29] <jpfarias__> hmm, do you think it would be much faster if I update them first then create the index?
[21:37:40] <jpfarias__> I created the index first then now I am updating
[21:38:14] <tomlikestorock> is it possible to do where $regex $in a list of regexes?
[21:51:09] <jpfarias__> tomlikestorock: you can probably make a complicated enough regex that matches :D
[21:51:27] <jpfarias__> (r1|r2|r3)
[21:51:33] <tomlikestorock> jpfarias__: yeah, but I saw you could definitely do a $in on regexes, so did that instead
[21:51:46] <jpfarias__> oh really?
[21:51:49] <ackspony> regex is nasty :(
[21:51:50] <jpfarias__> that's interesting
[21:58:14] <jcalvinowens> Quick question: is there a way to issue raw mongo shell commands from pymongo? Want to use forEach(), but not callable from Python.
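
For context: forEach is just client-side JavaScript that the shell runs over a cursor, e.g. (collection name hypothetical):

    db.users.find().forEach(function (doc) {
        print(doc._id);
    });

In pymongo the same thing is a plain Python for loop over the cursor object, so no shell pass-through is needed.
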
[21:58:35] <jpfarias__> tomlikestorock: is there any performance penalty for the $in instead of the big regex?
[21:58:52] <tomlikestorock> jpfarias__: no, it seems to use the appropriate indexes and be fine
[21:58:58] <jpfarias__> cool
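
What tomlikestorock describes, as a sketch (collection and field names hypothetical): $in accepts regex literals alongside plain values, and left-anchored, case-sensitive patterns like these can still use an index on the field:

    db.users.find({ name: { $in: [ /^foo/, /^bar/ ] } });
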
[23:52:13] <Aartsie> Hi all
[23:52:25] <Aartsie> is it save to update the mongoDB by apt-get update on my machine ?
[23:54:55] <ackspony> safe...
[23:55:03] <ackspony> package managers are generally safe