[03:25:53] <xilo> looking to see what type of database to use. i have this scenario: several objects of type Foo with predetermined values for properties. Object class Bar could have a reference to Foo.id=1 and object class Blah could also have a reference to Foo.id=1. can you set this up without being wonky, i.e. different collections, so i don't duplicate data? or should i stick to an RDBMS
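What xilo describes reads like the manual-reference pattern, which MongoDB supports without duplicating data. A minimal sketch (modern pymongo spelling; collection and field names are hypothetical):

```python
from pymongo import MongoClient

# Hypothetical sketch: Bar and Blah each store Foo's _id instead of
# embedding a copy of the Foo document.
db = MongoClient().mydb
foo_id = db.foos.insert_one({"name": "predefined foo"}).inserted_id
db.bars.insert_one({"label": "a bar", "foo_id": foo_id})
db.blahs.insert_one({"label": "a blah", "foo_id": foo_id})

# Resolving the reference costs a second query, since there are no joins:
bar = db.bars.find_one({"label": "a bar"})
foo = db.foos.find_one({"_id": bar["foo_id"]})
```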
[06:12:20] <laner> is it safe to run mms agent as root?
[07:18:37] <lessless> hi guys! how do I enable database listing in ubuntu? should I edit the startup script and add --rest, or is there a config option to do this?
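For what it's worth, `rest` is also a plain config-file option on 2.x, so no startup-script edit should be needed; the HTTP interface then listens on port 28017 by default. A hedged sketch of the relevant line:

```
# /etc/mongodb.conf (2.x ini-style config)
rest = true
```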
[10:59:19] <yorickpeterse> Hey folks, is there an easy way to get an overview of all the used fields in a Mongo collection? Preferably taking nested structures into account. For example:
[10:59:33] <yorickpeterse> Given you have something like this: [{title: "foo", author: {name: "john"}}, {title:"bar", author: {age: 10}}]
[10:59:57] <yorickpeterse> You'd expect something like this as the result: { title: 2, author: 2, author.name: 1, author.age:1 }
[11:15:10] <crudson> yorickpeterse: in ruby: http://pastie.org/8004413 - bit of a hack but it's 4am
[11:15:50] <yorickpeterse> Hm, that would be an option. However, this will be done for a pretty big (30m objects or so) collection
[11:16:14] <yorickpeterse> You can probably do it using the aggregation framework/map reduce but I fear that will be something I can't look at 2 weeks after writing it
[11:16:36] <crudson> not for 30 million, do a map reduce
[11:16:38] <yorickpeterse> But I'll keep the Ruby in mind just in case :)
[11:16:52] <yorickpeterse> Hm, map/reduce performs better than the aggregation framework in that case?
[11:17:01] <yorickpeterse> In most of our cases we've had the exact opposite in terms of performance
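A hedged sketch of the map-reduce route crudson suggests: count every field path, including nested ones, server-side. This assumes an older pymongo (Collection.map_reduce was removed in 4.0) and hypothetical collection names; special BSON types would need extra handling:

```python
from bson.code import Code
from pymongo import MongoClient

coll = MongoClient().mydb.mycollection  # hypothetical

# Emit one count per field path, recursing into sub-documents (but not
# arrays); e.g. {author: {name: "john"}} emits "author" and "author.name".
mapper = Code("""
function () {
    function emitKeys(prefix, doc) {
        for (var key in doc) {
            var path = prefix ? prefix + '.' + key : key;
            emit(path, 1);
            var val = doc[key];
            if (val !== null && typeof val === 'object' && !(val instanceof Array)) {
                emitKeys(path, val);
            }
        }
    }
    emitKeys('', this);
}
""")

# Sums partial counts; safe under re-reduce because it only adds.
reducer = Code("function (key, values) { return Array.sum(values); }")

result = coll.map_reduce(mapper, reducer, "field_usage")
for doc in result.find():
    print(doc["_id"], doc["value"])
```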
[12:13:05] <schlitzer|work> hey all, what is the recommended way to have a single-host shard, a config server & a mongos on the same ubuntu (12.04) host?
[12:13:27] <schlitzer|work> this is for a development environment
[12:20:19] <schlitzer|work> ahh got it, it is an upstart job
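For reference, the bare-bones way to stand the same thing up by hand, one process per role; paths, ports, and log files here are hypothetical (2.x-era flags):

```
mongod --shardsvr  --dbpath /data/shard0 --port 27018 --fork --logpath /var/log/mongo-shard0.log
mongod --configsvr --dbpath /data/config --port 27019 --fork --logpath /var/log/mongo-config.log
mongos --configdb localhost:27019 --port 27017 --fork --logpath /var/log/mongos.log
# then, from a mongo shell connected to the mongos:
#   sh.addShard("localhost:27018")
```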
[12:41:46] <foofoobar> Hi. I just installed mongodb and took a look in the data dir. I have no entries in the database and the data folder has a size of 3.3GB??
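That size is almost certainly preallocation rather than data: on a 64-bit 2.x install, mongod preallocates journal files of up to 1GB each plus initial data files on first start. For a dev box, the 2.x smallfiles option shrinks this considerably; a hedged example invocation:

```
mongod --dbpath /data/db --smallfiles
```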
[14:50:04] <Nodex> are you asking how they're stored in the index or how they're stored by mongo in relation to how they appear in the shell?
[14:51:35] <ixti> arthurnn: google mongo docs re structure of big nested docs. not sure what it was called
[14:51:46] <arthurnn> Nodex: my question has nothing to do with indexes... i am just asking 1. why do they keep the keys ordered in an embedded doc? and HOW? an RB tree?
[14:58:35] <Nodex> no, the document is NOT totally irrelevant
[15:02:54] <failshell> hey guys. i'm trying to restore a backup we did with LVM snapshots of a sharded cluster, on a different set of VMs. obviously, it's failing because the machine names changed. can anyone give me a brief crash course on doing that?
[15:40:24] <jpfarias_> To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use 2dsphere index.
[15:40:24] <jpfarias_> Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate reference system for GeoJSON uses the WGS84 datum.
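A hedged sketch of what that docs excerpt means in practice: GeoJSON points stored in [longitude, latitude] order under a 2dsphere index (MongoDB 2.4+; names hypothetical, modern pymongo spelling):

```python
from pymongo import MongoClient

coll = MongoClient().mydb.places  # hypothetical
coll.create_index([("loc", "2dsphere")])
coll.insert_one({"loc": {"type": "Point", "coordinates": [-73.97, 40.77]}})  # lng, lat

# Points within 500 meters of a query point, nearest first:
nearby = coll.find({"loc": {"$nearSphere": {
    "$geometry": {"type": "Point", "coordinates": [-73.98, 40.76]},
    "$maxDistance": 500,
}}})
```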
[15:40:24] <Nodex> I used to use named but it takes up too much space for no reason
[15:40:57] <Nodex> jpfarias_ : you should make sure that's NOT a typo
[15:40:58] <jpfarias_> guess it only matters for 2dsphere
[15:41:06] <aandy> Nodex: true. not a factor for my pet project, but worth considering, right. probably an okay cost when prototyping though, to avoid these bugs :)
[15:41:19] <Nodex> I can assure you that lat,long works fine
[15:41:30] <jpfarias_> it's been working fine for me too
[15:41:44] <jpfarias_> I am just worried that it may stop working on future releases if they change something
[15:44:53] <jpfarias_> 100 to 300 is the results size
[15:44:57] <vargadanis> I was reading about read performance of MongoDB. In a situation where it is not critical to have the most up-to-date information, I could direct the reads to specific replica set members... However the same docs page says it would be smarter to just direct all reads to the primary and use sharding instead. Are the 2 mutually exclusive?
[15:45:12] <jpfarias_> it's a pretty fucking good machine :)
[15:45:23] <Nodex> vargadanis : sharding is not mutually exclusive
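They aren't mutually exclusive in practice: each shard is normally itself a replica set, so you can shard for scale and still route stale-tolerant reads to secondaries with a read preference. A hedged pymongo sketch with hypothetical hosts and names:

```python
from pymongo import MongoClient
from pymongo.read_preferences import ReadPreference

client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")  # hypothetical

# Stale-tolerant reads go to a secondary when one is available;
# writes (and anything read via the default handle) still hit the primary.
events = client.mydb.get_collection(
    "events", read_preference=ReadPreference.SECONDARY_PREFERRED)
```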
[15:56:47] <vargadanis> it is so damn hard to get started with MongoDB ! :) I wish I had a guy I could tell what kind of data I want to store and what the average read/write load is, and he'd set it up for me and just tell me: it's gonna work, go do your programming! :D
[15:56:56] <dougb> is {"Field":1} essentially the same as creating an indexed field in MySQL?
[15:59:45] <dougb> I'm using MongoHub right now as a client, so I just enter in a query string under 'index' but I assume it's running it through ensureIndex
[15:59:47] <Nodex> I remove the index and my query goes to 4 seconds
[15:59:58] <dougb> I've tested locally and it seems to work
[16:00:01] <Nodex> I add it and the query is then ~15ms
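Roughly, yes: {"Field": 1} is an ascending single-field index spec, the analogue of a MySQL secondary index, and a GUI client like MongoHub presumably passes it to ensureIndex. A hedged sketch of doing the same directly and confirming the index is used (names hypothetical):

```python
from pymongo import MongoClient

coll = MongoClient().mydb.mycollection  # hypothetical
coll.create_index([("Field", 1)])       # the {"Field": 1} spec
print(coll.find({"Field": "some value"}).explain())  # should show the index, not a full scan
```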
[16:00:15] <kali> vargadanis: you'll do it wrong anyway. it's always wrong the first time. so go do your programming, it won't work, we'll help you fix it, and you'll learn something :)
[16:00:19] <jpfarias_> it is only slow the first time
[16:00:25] <jpfarias_> when I requery it goes really fast
[16:00:33] <jpfarias_> for the same parameters that is
[16:00:51] <jpfarias_> when the location change it goes slow again the 1st time
[16:02:10] <vargadanis> dougb, I guess that is an application-specific thing, but I'd guess so. I mean it would be the Vulcan (logical) way to do it for me
[16:02:10] <Nodex> you have something wrong jpfarias_ : Mine is instant
[16:04:23] <Nodex> I can change the lat/long pair and it's still instant is what I mean
[16:04:33] <double_p> so.. i have this offsite replica. lowest priority and all fine. but it _happens_ that it loses connectivity and then reelection blues (quick, but it DROPS connections). even though i dislike it, i would set it to hidden. but that doesn't prevent an election if it's lost.
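One hedged option on top of hidden/priority-0: setting votes to 0 takes the flaky offsite member out of the election math entirely, so its dropping off can't feed a reelection. A sketch via pymongo against the primary; the member index is hypothetical:

```python
from pymongo import MongoClient

client = MongoClient("primary-host")          # hypothetical primary
cfg = client.local.system.replset.find_one()  # current replica set config
offsite = cfg["members"][2]                   # hypothetical: the offsite member
offsite.update({"hidden": True, "priority": 0, "votes": 0})
cfg["version"] += 1
client.admin.command("replSetReconfig", cfg)
```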
[16:07:16] <jpfarias_> well that query you did seems to return only 1
[16:07:21] <vargadanis> I read that it is a good idea to create shards per core rather than shards per server for better perf. however, is there a way to actually limit a mongod instance to a core, or even "dedicate" a core to one? Working on a Linux system..
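On Linux the usual tool for that is taskset (or cgroups/cpusets for stricter isolation); a hedged one-liner pinning a hypothetical mongod to core 0:

```
taskset -c 0 mongod --port 27018 --dbpath /data/shard0
```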
[16:33:06] <jpfarias_> but sometimes it goes up to > 30 secs
[16:33:18] <jpfarias_> then 2nd time and on is always fast
[16:36:10] <vargadanis> I've read - though it's a bit different - that to optimize queries it is sometimes a good idea to query the entire data set first and then run the query that "looks" for something.. but it only works for smaller data sets that fit into memory
[16:36:21] <vargadanis> and also this is from not much of a trusted source
[16:36:58] <vargadanis> guess the first query runs slow because of disk IO? O_o dunno
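That full-scan trick is just pre-warming the working set into RAM, which is also why the first query is slow (disk IO) and repeats are fast. MongoDB 2.2+ exposes the same idea as the touch command; a hedged pymongo sketch with a hypothetical collection name:

```python
from pymongo import MongoClient

db = MongoClient().mydb
db.command("touch", "events", data=True, index=True)  # load data + indexes into memory
```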
[16:37:30] <vargadanis> anyway, I gotta run now :) see ya around, nice folks
[17:25:15] <failshell> i'm trying to reconfigure a sharded cluster from backups. problem is, the cluster sees the new shards and the old ones. every command i try fails because of that. tried db.runCommand(removeShard), but it's failing because the balancer is not running
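A hedged sketch of unblocking that particular error: re-enable the balancer by clearing the stopped flag in the config database (2.x-era layout), then retry the drain; the mongos host and shard name are hypothetical:

```python
from pymongo import MongoClient

mongos = MongoClient("mongos-host")  # hypothetical mongos
mongos.config.settings.update_one(
    {"_id": "balancer"}, {"$set": {"stopped": False}}, upsert=True)
print(mongos.admin.command("removeShard", "oldshard0001"))  # hypothetical shard name
```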
[17:27:35] <devastor> We just got "Tue Jun 4 17:08:29 [repl writer worker 2] production.messages Assertion failure type == cbindata src/mongo/db/key.cpp 585" with mongo db 2.2.3, any known bugs that might cause that?
[17:27:55] <devastor> It crashed mongo and it doesn't start again
[17:34:41] <devastor> [repl writer worker 2] ERROR: writer worker caught exception: assertion src/mongo/db/key.cpp:585 on { long document data here or part of it }
[17:34:59] <devastor> [repl writer worker 2] Fatal Assertion 16360 - it dies due to this
[17:37:34] <starfly> failshell: Dude, most folks probably don't want to weigh in on that dicey situation. I'd guess the only way you can effectively recover is to build an environment that looks very similar (operating system, hostnames, hardware) to what you backed up previously and recover into that. Commingling old and new shards sounds like a losing proposition.
[17:38:30] <failshell> so all the config points to wrong hosts
[17:39:30] <starfly> failshell: so, build the staging environment out like it was production. You can use the same hostnames; you obviously can't do that in DNS, only locally on the staging hosts...
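That is, reuse the production hostnames purely via the staging hosts' local resolver; a hedged /etc/hosts sketch with hypothetical names and addresses:

```
# /etc/hosts on each staging box
10.0.0.11  shard1.prod.example.com
10.0.0.12  shard2.prod.example.com
10.0.0.13  cfg1.prod.example.com
```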
[20:36:59] <miskander> I need to find all records based on an attribute in a different collection. For example, find all comments where the user role is 'author'
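Since there are no joins here (this era predates $lookup), that is two queries: collect the matching user ids, then filter comments with $in. A hedged sketch, names hypothetical:

```python
from pymongo import MongoClient

db = MongoClient().mydb
author_ids = [u["_id"] for u in db.users.find({"role": "author"}, {"_id": 1})]
comments = list(db.comments.find({"user_id": {"$in": author_ids}}))
```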
[20:37:16] <jpfarias__> I'm not sure if you are the right person for this, but why would mongo take so long to answer some queries, even when the index is there?
[20:40:52] <niemeyer> jpfarias__: Not sure.. it's loaded 30k objects, and hasn't used indexes
[20:41:24] <jpfarias__> so should I use a different query then?
[20:41:35] <jpfarias__> something like a box would be better?
[20:41:53] <jpfarias__> I didn't know $center didn't use the index
[20:42:35] <niemeyer> jpfarias__: I have never used the geo features of MongoDB, so I'm not a good person to walk you through this, but I'm telling you what that query you made is showing you
[20:43:11] <niemeyer> jpfarias__: Do you have a 2d index in the proper fields?
[20:59:57] <niemeyer> jpfarias__: Again, without knowledge of the implementation, it shouldn't make a difference if the 2d index is indeed in a Euclidean space
[21:00:33] <jpfarias__> yeah that's what I thought too
[21:01:02] <niemeyer> jpfarias__: In other words, if it's flat and bounded to -180..180 on both values
[21:01:03] <jpfarias__> but since you made the geohash algorithm I thought you'd know better :)
[21:02:41] <niemeyer> jpfarias__: In the sense the coordinates will be completely wrong
[21:03:40] <niemeyer> jpfarias__: If everything just works on a flat 2d space, and you swap x/y, it'd still work, though.. but given the comment in the page, I'd ask someone that knows better than I do
[21:04:18] <niemeyer> jpfarias__: Once you start using spherical coordinates, that'll all be completely bogus for sure
[21:08:24] <jpfarias__> but would that be the reason the query is slow?
[21:12:24] <niemeyer> jpfarias__: That query is running over 30k documents, yielding 600 times.. I don't know where you're running this, but it's definitely not surprising that it's misperforming.
[21:25:41] <niemeyer> jpfarias__: Have you investigated queries around the latitude boundary (90 or -90)?
[21:26:13] <niemeyer> jpfarias__: The docs say the limits are bounded at 180, but I wouldn't be surprised if part of the logic is actually enforcing the correct boundaries.
[21:26:27] <niemeyer> jpfarias__: Strictly speaking your query has an invalid latitude.
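For reference, a hedged sketch of the shape being debugged: a flat 2d index queried with $geoWithin/$center (2.4+ spelling; $within on older servers) in [longitude, latitude] order and inside the legal ranges, with explain() to check whether the index is actually used (names hypothetical):

```python
from pymongo import MongoClient

coll = MongoClient().mydb.places  # hypothetical
coll.create_index([("loc", "2d")])
query = {"loc": {"$geoWithin": {"$center": [[-73.97, 40.77], 0.05]}}}  # [lng, lat], radius in degrees
print(coll.find(query).explain())
```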
[21:58:14] <jcalvinowens> Quick question: is there a way to issue raw mongo shell commands from pymongo? I want to use forEach(), but it's not callable from Python.
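For what it's worth, the shell's forEach() is just client-side cursor iteration, so the idiomatic pymongo equivalent is a plain loop; server commands that shell helpers wrap are reachable through Database.command. A hedged sketch with hypothetical names:

```python
from pymongo import MongoClient

db = MongoClient().mydb
for doc in db.things.find({"status": "active"}):  # hypothetical filter
    print(doc["_id"])                             # per-document work goes here

print(db.command("collstats", "things"))          # e.g. what the shell's db.things.stats() wraps
```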
[21:58:35] <jpfarias__> tomlikestorock: is there any performance penalty for the $in instead of the big regex?
[21:58:52] <tomlikestorock> jpfarias__: no, it seems to use the appropriate indexes and be fine
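For contrast, a hedged sketch of the two query shapes being compared (names hypothetical): exact values in $in can be looked up in the index directly, while a large alternation regex generally forces much broader index scanning:

```python
from pymongo import MongoClient

coll = MongoClient().mydb.mycollection  # hypothetical
fast = coll.find({"tag": {"$in": ["red", "green", "blue"]}})
slow = coll.find({"tag": {"$regex": "^(red|green|blue)$"}})
```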