[00:09:02] <maginot> Hello. In my DB when I execute db.markers.find() I have this result: http://pastebin.com/YATr8kJS but when I try something like db.markers.find({ user_info : { email : 'user@gmail.com' }}) it returns no results
[00:10:41] <maginot> I can't understand why, maybe someone with more experience can take a look and find what I'm missing here, thanks
[00:16:23] <joannac> maginot: I'm not an expert by any means, but I think it needs to match exactly
[00:16:41] <joannac> e.g. http://docs.mongodb.org/manual/core/read/ (check the section saying "exact matches")
[00:17:00] <joannac> yours doesn't match cos it doesn't have the "name" field
[00:31:27] <maginot> joannac: hello, thanks for the reply, I was advised on Stack Overflow to use dot notation, and now I understand what was going wrong. Thanks
[00:32:29] <joannac> maginot: no probs. I'm only just a newbie myself :)
[00:35:11] <maginot> joannac: so let me share the solution, I should be doing db.markers.find( { 'user_info.email' : 'user@gmail.com' })
[00:35:43] <joannac> yeah... in the same doc i linked, further down, has the same thing
[00:35:54] <joannac> if you want partial matches in subdocs you need dot notation
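(A minimal illustration of the difference discussed above, using maginot's collection: the first form only matches an exact subdocument, while dot notation matches the field inside it.)

```javascript
// Exact subdocument match: only matches when user_info is exactly
// { email: 'user@gmail.com' }, with no sibling fields such as "name".
db.markers.find({ user_info: { email: 'user@gmail.com' } })

// Dot notation: matches on the email field regardless of other
// fields present in user_info.
db.markers.find({ 'user_info.email': 'user@gmail.com' })
```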
[01:56:38] <doktrin> can anyone point me to a good resource (including freenode channels) to learn more about geospatial indexing?
[02:15:09] <lyetz> Can someone advise me on the best way to query a nested array within a nested array?
[02:15:12] <lyetz> I'm trying to query the salePrice field on this schema - http://cl.ly/image/1x2T0d1i2Y22
[02:15:31] <lyetz> I can use $elemMatch to query directly within the en schema, but not the pricing schema within the en schema
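(A hypothetical sketch of what lyetz may be after; the real schema is only in the linked screenshot, so assume something like { en: [ { pricing: [ { salePrice: ... } ] } ] } and a "products" collection. $elemMatch can be nested to reach an array inside an array.)

```javascript
// Match documents where some "en" element contains some "pricing"
// element whose salePrice satisfies the condition.
db.products.find({
  en: { $elemMatch: { pricing: { $elemMatch: { salePrice: { $lt: 100 } } } } }
})
```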
[02:23:35] <codezombie> Would mongodb be well suited for storing metrics? Right now we're storing them in mysql, and I feel this limits our flexibility when we need to change what we collect, or if we want to shift things around at all.
[02:26:44] <preaction> codezombie, i plan on using capped collections for exactly that, or have you considered opentsdb or rrdtool?
[02:27:18] <codezombie> preaction: I have not, honestly mongodb was the first thing that popped into my head. I've got a meeting @ 9am to try and sell this to the managers.
[02:27:40] <preaction> my company uses opentsdb for it, but it's java so ew. rrdtool is C, but it's got a bit of a learning curve
[02:27:51] <fogus> Is there a "show create collection"?
[02:27:58] <preaction> that said, if you're already using mongo in other places, mongo is probably a good idea (if you're already replicating things, for example)
[02:28:22] <codezombie> preaction: not currently using mongo. They're storing all this in mysql at the moment.
[02:28:52] <preaction> fogus, collections aren't created, they spawn as necessary. perhaps you want the indexes?
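(Per preaction's point, the closest shell equivalent to MySQL's SHOW CREATE TABLE is listing a collection's indexes; "mycoll" is a placeholder name.)

```javascript
// Collections appear on first insert; the indexes are the only
// "definition" worth dumping.
db.mycoll.getIndexes()
```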
[02:30:02] <preaction> codezombie, so i'd take a look at all three, mongo, opentsdb, and rrdtool (and its sisters munin and cacti), and make a decision. opentsdb is pretty easy to get going with, from what i've seen (it's got a robust query/graph thing, and you just save the URLs for the graphs you want)
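(A minimal sketch of the capped-collection idea preaction mentions; the collection name, size, and fields are made up.)

```javascript
// A fixed-size collection: once "size" bytes are used, the oldest
// documents are overwritten, which suits rolling metrics storage.
db.createCollection("metrics", { capped: true, size: 100 * 1024 * 1024 })
db.metrics.insert({ host: "web1", metric: "cpu", value: 0.42, ts: new Date() })
```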
[03:15:51] <marcosnils> hi there, I've found a weird behaviour when trying to update multiple records using the forEach function. Anyone here who may lend me a hand?
[06:29:48] <heloyou> http://pastebin.com/LWQ5Dh7e .. how can i search for a document with things.name:foo1 and things.value:123 ? if i search for things.name:foo2 and things.value:123 i still get a result...hm
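(The behaviour heloyou describes is expected: with plain dot notation, each condition may match a *different* element of the "things" array. $elemMatch forces both conditions onto the same element; the collection name below is assumed from the paste.)

```javascript
// Matches only documents where a single "things" element has both
// name 'foo1' and value 123.
db.coll.find({ things: { $elemMatch: { name: 'foo1', value: 123 } } })
```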
[08:50:06] <diffuse> Is there any benefit to deduping records prior to issuing a batch insert to a collection that has a unique index with dropDups enabled?
[09:00:31] <diffuse> well, i only plan on deduping what is about to be inserted
[09:01:00] <diffuse> i am not going to dedup my list against the db
[09:01:39] <diffuse> I just noticed that inserting a dupe into mongo can be slow
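(A sketch of the client-side dedupe diffuse describes, with invented names: "batch" is the array to insert and "key" the uniquely-indexed field. It spares the server the cost of rejecting each duplicate against the unique index.)

```javascript
// Keep only the first document seen for each key value.
var seen = {};
var unique = batch.filter(function (doc) {
  if (seen[doc.key]) return false;
  seen[doc.key] = true;
  return true;
});
db.items.insert(unique);   // batch insert of the deduped array
```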
[09:13:28] <silasdavis> Suppose I have some naturally hierarchical fields, like 'country', 'county', 'town'
[09:13:55] <silasdavis> and I know that I will always have all three values when performing a lookup
[09:15:32] <silasdavis> what are the relative merits of describing all three in a 'location' document: { "country": ..., "county": ..., "town":...} vs three embedded documents
[09:24:04] <Nodex> http://docs.mongodb.org/manual/core/indexes/#compound-indexes <--- look at the example
[09:24:25] <Nodex> specifically the bit after "The index, however, would not support queries that select the following"
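(The point of the linked docs section, illustrated with silasdavis's fields; collection name invented.)

```javascript
db.places.ensureIndex({ country: 1, county: 1, town: 1 })

// Supported by the index (left-to-right prefixes):
db.places.find({ country: 'UK' })
db.places.find({ country: 'UK', county: 'Kent', town: 'Dover' })

// NOT supported: these skip the leading field(s) of the index.
db.places.find({ town: 'Dover' })
db.places.find({ county: 'Kent', town: 'Dover' })
```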
[09:26:20] <silasdavis> Nodex, ah so in fact it is a hierarchical index, I misread
[09:26:49] <silasdavis> I'd expect them to be very similar then
[09:27:12] <silasdavis> I don't see why this index would be excessively large
[09:27:53] <Nodex> LOL ok. I have put the geonames database into mongo and applied an index similar to what you are asking and the index is about 150gb
[09:28:14] <Nodex> now I don't have 150gb of ram so my indexes will be missing a large portion of the time
[09:28:33] <silasdavis> what fields were in your index?
[09:29:37] <Nodex> one was country, one was place, I can't remember the other
[09:29:42] <silasdavis> it might, also I'd be interested to replicate it
[09:30:04] <Nodex> you want to replicate the data or shard it?
[09:30:43] <silasdavis> no I mean replicate what you're saying about a 150gb index
[09:30:56] <Nodex> so you want 2 machines with 150gb indexes?
[09:31:21] <Nodex> oh sorry, I thought you were referring to replica sets
[09:32:01] <Nodex> do what you need to do. I have very extensive experience when it comes to area data and mongodb so you can either take my advice or not :)
[09:33:31] <silasdavis> Nodex, was it compound index or a geo index?
[09:36:15] <Nodex> the size difference is not that much between the 2, I have tried both
[09:43:44] <silasdavis> I just had a look at the back-of-an-envelope formula for index size: http://stackoverflow.com/questions/8607637/are-there-any-tools-to-estimate-index-size-in-mongodb
[09:44:04] <silasdavis> the allCountries geonames data has 8434481 entries
[09:44:50] <silasdavis> suppose it's UTF-8 encoded, then worst case each character takes 6 bytes to encode, and suppose our average indexed field length is 20
[09:45:09] <silasdavis> (this should be an overestimate)
[09:45:32] <silasdavis> then according to that the index size would be around (2*8434481*(18+60+5)) bytes
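(Evaluating that expression makes the gap silasdavis points out below concrete: the estimate is roughly 1.4 GB, two orders of magnitude under 150 GB.)

```javascript
2 * 8434481 * (18 + 60 + 5)   // = 1400123846 bytes, ~1.3 GiB
```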
[09:46:32] <Nodex> guess you have it all figured out then :)
[09:46:34] <silasdavis> I'm not sure how being compound would affect this. I guess for every parent value in the index we would need another smaller index
[09:46:57] <silasdavis> Nodex, clearly I don't, I'm just trying to get to the bottom of this
[09:48:06] <silasdavis> because there seems to be a difference of two orders of magnitude between that estimate and what you have observed
[09:49:28] <Nodex> I guess I should rephrase "index" to "working set"
[09:50:24] <Nodex> 1.8M UK postcodes in a compound index = 250mb index and 650mb storage size
[10:04:33] <Nodex> sometimes I miss his weird sense of humour
[10:04:43] <Nodex> but most of the time, not so much
[10:04:46] <remonvv> I ran into one of Antoine's blogs and he responded with "There are specialized storages like Riak & friends for exactly this purpose. Why do you need or want to bend MongoDB that? Do you know what you are doing?"
[10:46:29] <lpin> the downside is that when the author wants to edit the item, for example changing an answer, the whole document has to be updated
[10:58:01] <remonvv> lpin, not sure if I missed part of the conversation but my opinion is that you should definitely not go for embedding everything.
[10:58:11] <remonvv> Nor should your schema affect your API design and vice versa by the way
[11:01:41] <lpin> remonvv so you are suggesting to use a different collection for each resource? e.g. the items collection the subitems collection and the answers collection?
[11:06:09] <remonvv> No I'm suggesting to design your API to be as easy to use as possible for the developers or clients that consume your API and make informed decisions on when and when not to embed.
[11:06:25] <remonvv> There are a few guidelines on when and when not to embed.
[11:10:37] <lpin> remonvv, i'm struggling to find the way to go, but anyway thank you :>
[11:10:55] <remonvv> Is it a generic system or do you know which collection is for what purpose?
[11:11:40] <remonvv> If it's generic you have to go for separate collections really. If you don't know what the bounds are of the documents in a sub collection you cannot embed.
[11:12:12] <remonvv> Embedded structures should be relatively small to avoid costly updates.
[11:13:27] <Nodex> lpin : to clarify I was saying that ONE document per question is fine - not ONE document period
[11:13:28] <lpin> well this isn't much of a concern; an item can have at most 10 subitems and a subitem can have at most 4 answers
[11:19:30] <remonvv> lpin, that sounds okay to embed then.
[11:25:20] <lpin> remonvv also when editing, the author should see the item in its whole so i don't need a great granularity. Something like PUT http://mydomain.com/items/123/subitems/123/answers/1 to edit a single answer doesn't make much sense
[11:26:26] <lpin> but PUT the whole document when editing a single answer could be overkill indeed
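(A sketch of the middle ground: dot notation plus the positional operator update a single embedded answer without rewriting the whole document. All field names here are guesses at lpin's schema.)

```javascript
// Assumed shape: { _id, subitems: [ { _id, answers: [ { text: ... } ] } ] }.
// "$" resolves to the matched subitem; the answer is addressed by position,
// since the positional operator only applies to one array level here.
db.items.update(
  { _id: itemId, 'subitems._id': subitemId },
  { $set: { 'subitems.$.answers.0.text': 'corrected answer' } }
)
```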
[11:27:06] <lpin> i should hire someone for consulting :>
[11:33:45] <lpin> i have to go home now, thank you remonvv and Nodex
[11:42:50] <twrivera> anyone willing to answer some mongo topology design questions?
[12:37:01] <game_admin_231> hello :), I am trying to sync a new replica set member to my mongo rs. I checked the logs of my primary node and also of the new node. The new node's log keeps saying: [journal] DR101 latency warning on journal file open 17507ms. Is my storage system too slow, or can I ignore output like this?
[12:44:15] <ak5> hi, I have a mongod running on a qemu vm, it can't allocate space - why is this?
[13:51:44] <Nodex> I can feel a nice blog coming on about how to implement point in time backups with MongoDB - as in only foo,bar,baz have changed since last time so we only need to back those up
[14:07:57] <marcqualie> Is there a simple way to check the health and how in sync config servers are? I can't seem to find any docs on the config servers apart from initially setting them up
[14:30:21] <joe_p> marcqualie: use config; db.runCommand({dbHash:1})
[14:36:15] <marcqualie> joe_p: what does that do? The documentation simply says "dbHash is an internal command" and nothing else
[14:41:27] <joe_p> marcqualie: run that on your config servers to see if the data is in sync
[14:42:08] <joe_p> (08:05:03 AM) marcqualie: Is there a simple way to check the health and how in sync config servers are?
[14:45:36] <marcqualie> Awesome! Thanks joe_p, I'll hook this into monitoring somehow to keep check on it
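(A sketch of how that check could look: run it against each config server and compare the hashes; matching values mean the config data is in sync.)

```javascript
// Run on each config server; compare the "md5" and per-collection
// hashes across servers.
var conf = db.getSiblingDB('config');
printjson(conf.runCommand({ dbHash: 1 }));
```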
[14:51:09] <jeffwhelpley> Is there anywhere I can look to see Mongo schema designs that other people have set up for standard stuff like a users collection (i.e. with username, password hash, etc.)? I am designing a new schema and I don't want to re-invent the wheel. I have obviously Googled but haven't found anything good yet.
[14:58:00] <Nodex> I suggest you do it according to your app - it will teach you a little of data structures too
[15:00:57] <jwang> jeffwhelpley: if you have a dictionary/hash representation of your user, you can start with that and adapt as necessary
[15:04:01] <toothrot> i found an error in my logs this morning on a group(): "JavaScript execution terminated".. what can cause that? It's only happened once...
[15:10:10] <jeffwhelpley> @Nodex and @jwang, totally understand and agree. Perhaps that example convoluted what I am really interested in. I am new to mongodb and I am really just interested in seeing real-world examples of schema designs that other people have done for real apps used in production. I have read about the principles of NoSql schema design, but I learn more from seeing real world examples.
[15:11:12] <kali> i'm not sure you would learn much if i was to show you my users (which i won't do anyway)
[15:13:06] <Nodex> the good thing about mongo is you can start small and add to it
[15:13:21] <Nodex> and it's easy to do - i.e. NO ALTER TABLE statements or large migrations
[15:16:36] <Nodex> http://pastebin.com/9kkbBMxU <---- there is an example of one of mine if it helps
[15:18:33] <jeffwhelpley> YES, that does help! OK, so this is exactly what I was hoping to see. For example, you have a groups array in your users collection that has the name of the group. I am assuming there is a separate groups collection, so I am wondering why you don't have the _id value of that other collection in there as well as the name.
[15:18:52] <jeffwhelpley> or is it that you use the name as the _id?
[15:19:59] <Nodex> because I don't need the ID of the group
[15:20:35] <Nodex> I will give you the best piece of advice ever for learning mongodb
[15:20:44] <Nodex> well non relational stores in general
[15:20:57] <Nodex> work out how to do it in an RDBMS and do the opposite
[15:22:44] <jeffwhelpley> ha, good advice. I guess my mindset is still RDBMS in some ways
[15:22:48] <Nodex> relating data for the hell of it (i.e. groups to IDs in a join somewhere) really is pointless - how often do you change a group name, that it becomes worth joining instead of statically storing it?
[15:23:11] <Nodex> same as a username or a forename/surname ... date of birth - these things don't change all that often
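(A made-up document in the spirit of Nodex's pastebin: rarely-changing values like group names are stored statically on the user rather than joined by _id.)

```javascript
db.users.insert({
  username: 'alice',
  password_hash: '$2a$10$...',   // store a hash, never the raw password
  forename: 'Alice',
  surname: 'Smith',
  groups: [ { name: 'admins' }, { name: 'editors' } ]   // names, not _ids
})
```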
[15:28:07] <jeffwhelpley> That makes a lot of sense. Thanks a lot!
[17:03:16] <Almindor> if you use 2d index, can you have only one extra value for a compound?
[17:04:05] <Almindor> we have 200 million documents but usually query on position, time (descending mostly), foreign id and type (mostly in this order)
[17:04:22] <Almindor> is it possible to have that kind of index?
[17:07:42] <Almindor> also I tried to use geoHaystack (with the time as 3rd) and I got this error on a query: "errmsg" : "exception: assertion src/mongo/db/geo/haystack.cpp:178",
[17:07:54] <Almindor> it's not possible to query a box on geoHaystack?
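(On the first question: the 2d docs of this era allow only one additional field in a compound 2d index, so a sketch like the one below, with an assumed "pos" field holding [lng, lat], is about the limit; the other criteria can't go in that index. On the second: haystack indexes are queryable only through the geoSearch command rather than find(), which is likely why the box query asserts.)

```javascript
// A 2d index with its single permitted extra field.
db.readings.ensureIndex({ pos: '2d', time: 1 })

// Box query via the 2d index, sorted by time descending.
db.readings.find({ pos: { $geoWithin: { $box: [[0, 0], [10, 10]] } } })
           .sort({ time: -1 })
```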
[17:26:50] <RomainT1> I've posted my question on SO: http://stackoverflow.com/q/16344002/432929
[18:48:35] <prop> I'm running a mapReduce on multiple collections into a single overview collection, using the reduce action to combine values with existing keys. The problem is I want to conditionally apply different logic when reducing, depending on the source collection or if it's an existing value.
[18:49:44] <prop> Is there any way to discern the source, or a good structure for achieving the same result? I was setting a source field on each value in the different mappers and in the finalize function, but because finalize is called multiple times, the actual condition is being obscured.
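(One way to keep the source visible, sketched with invented names; the key point is that reduce, unlike finalize, must tolerate being fed its own output, so per-source values are kept in separate counters that merge associatively.)

```javascript
// Each mapper stamps its values with a per-source slot.
var mapA = function () { emit(this.key, { a: this.value, b: 0 }); };
var mapB = function () { emit(this.key, { a: 0, b: this.value }); };

// Reduce merges slots, so it is safe to run any number of times.
var reduceFn = function (key, values) {
  var out = { a: 0, b: 0 };
  values.forEach(function (v) { out.a += v.a; out.b += v.b; });
  return out;
};

db.collA.mapReduce(mapA, reduceFn, { out: { reduce: 'overview' } });
db.collB.mapReduce(mapB, reduceFn, { out: { reduce: 'overview' } });
```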
[18:56:48] <themoebius_> in the aggregation framework, is it possible to get the number of elements in an array?
[18:57:02] <themoebius_> like if I want to sum the number of elements in the array for each document?
[19:07:15] <leifw> themoebius_: I think you can use $unwind and then $group with $sum
[19:07:23] <leifw> themoebius_: but I'm not sure exactly, I haven't tried
[19:07:59] <themoebius_> leifw: i think you're right
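(What leifw's suggestion could look like, with "items" standing in for the array field.)

```javascript
// Count the elements of the "items" array per document...
db.coll.aggregate([
  { $unwind: '$items' },
  { $group: { _id: '$_id', count: { $sum: 1 } } }
])

// ...or one grand total across all documents.
db.coll.aggregate([
  { $unwind: '$items' },
  { $group: { _id: null, total: { $sum: 1 } } }
])
```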
[19:14:23] <themoebius_> leifw: how about the number of elements in a hash?
[20:26:50] <JoeyJoeJo> In my documents I have one field which is an array that holds latitude and longitude. I want to find all documents where the latitude is less than 0. How do I search for that?
[20:28:22] <Gargoyle> JoeyJoeJo: {doc.location.lat: {$lt: 0}} I think, off the top of my head. not done any mongo since xmas.
[20:28:42] <Gargoyle> JoeyJoeJo: Assuming your lat is in doc.location.lat
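(Expanding on Gargoyle's answer: the dotted key must be quoted in the shell, and the exact form depends on how the pair is stored; both variants below are guesses at the schema.)

```javascript
// If it's a subdocument, e.g. { location: { lat: -33.9, lng: 18.4 } }:
db.places.find({ 'location.lat': { $lt: 0 } })

// If it's a bare array with latitude first, e.g. { loc: [-33.9, 18.4] },
// address the element by position:
db.places.find({ 'loc.0': { $lt: 0 } })
```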