#mongodb logs for Thursday the 2nd of May, 2013

[00:09:02] <maginot> Hello. In my DB when I execute db.markers.find() I have this result: http://pastebin.com/YATr8kJS but when I try something like db.markers.find({ user_info : { email : 'user@gmail.com' }}) it returns no results
[00:10:41] <maginot> I can't understand why, maybe someone with more experience can take a look and find what I'm missing here, thanks
[00:16:23] <joannac> maginot: I'm not an expert by any means, but I think it needs to match exactly
[00:16:41] <joannac> e.g. http://docs.mongodb.org/manual/core/read/ (check the section saying "exact matches")
[00:17:00] <joannac> yours doesn't match cos it doesn't have the "name" field
[00:31:27] <maginot> joannac: hello, thanks for the reply, I was advised on Stack Overflow to use dot notation, and now I understand what was going wrong. Thanks
[00:32:29] <joannac> maginot: no probs. I'm only just a newbie myself :)
[00:35:11] <maginot> joannac: so let me share the solution, I should be doing db.markers.find( { 'user_info.email' : 'user@gmail.com' })
[00:35:43] <joannac> yeah... the same doc I linked shows the same thing further down
[00:35:54] <joannac> if you want partial matches in subdocs you need dot notation
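A minimal sketch of the fix being described, assuming (since the pastebin has since expired) that user_info is an embedded document with an email field:

    // exact subdocument match: only matches if user_info contains exactly this one field
    db.markers.find({ user_info: { email: 'user@gmail.com' } })
    // dot notation: matches on the embedded field regardless of its siblings
    db.markers.find({ 'user_info.email': 'user@gmail.com' })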
[01:56:38] <doktrin> can anyone point me to a good resource (including freenode channels) to learn more about geospatial indexing?
[02:15:09] <lyetz> Can someone advise me on the best way to query a nested array within a nested array?
[02:15:12] <lyetz> I'm trying to query the salePrice field on this schema - http://cl.ly/image/1x2T0d1i2Y22
[02:15:31] <lyetz> I can use $elemMatch to query directly within the en schema, but not the pricing schema within the en schema
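One plausible way to reach the inner array is nesting $elemMatch; a sketch only, since the schema screenshot is gone, so the collection name and shape ({ en: [ { pricing: [ { salePrice: ... } ] } ] }) are assumptions:

    db.products.find({
        en: { $elemMatch: {
            pricing: { $elemMatch: { salePrice: { $lt: 100 } } }
        } }
    })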
[02:23:35] <codezombie> Would mongodb be well suited for storing metrics? Right now we're storing them in mysql, and I feel this limits our flexibility when we need to change what we collect, or if we want to shift things around at all.
[02:26:44] <preaction> codezombie, i plan on using capped collections for exactly that, or have you considered opentsdb or rrdtool?
[02:27:18] <codezombie> preaction: I have not, honestly mongodb was the first thing that popped into my head. I've got a meeting @ 9am to try and sell this to the managers.
[02:27:40] <preaction> my company uses opentsdb for it, but it's java so ew. rrdtool is C, but it's got a bit of a learning curve
[02:27:51] <fogus> Is there a "show create collection"?
[02:27:58] <preaction> that said, if you're already using mongo in other places, mongo is probably a good idea (if you're already replicating things, for example)
[02:28:22] <codezombie> preaction: not currently using mongo. They're storing all this in mysql at the moment.
[02:28:52] <preaction> fogus, collections aren't created, they spawn as necessary. perhaps you want the indexes?
[02:30:02] <preaction> codezombie, so i'd take a look at all three, mongo, opentsdb, and rrdtool (and its sisters munin and cacti), and make a decision. opentsdb is pretty easy to get going with, from what i've seen (it's got a robust query/graph thing, and you just save the URLs for the graphs you want)
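For reference, the capped-collection approach preaction mentions is one line to set up; the collection name and size here are illustrative:

    // fixed-size, insert-ordered collection; the oldest metrics are evicted automatically
    db.createCollection('metrics', { capped: true, size: 100 * 1024 * 1024 })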
[03:15:51] <marcosnils> hi there, I've found a weird behaviour when trying to update multiple records using the forEach function. Anyone here who may lend me a hand?
[06:29:48] <heloyou> http://pastebin.com/LWQ5Dh7e .. how can i search for a document with things.name:foo1 and things.value:123 ? if i search for things.name:foo2 and things.value:123 i still get a result...hm
[06:32:36] <heloyou> ah neat things:{name:foo1,value:123}
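Note that the exact-subdocument form above only matches array elements containing exactly those two fields; the usual way to require both conditions on the same element is $elemMatch (the collection name from the now dead pastebin is unknown, so treat it as a placeholder):

    db.coll.find({ things: { $elemMatch: { name: 'foo1', value: 123 } } })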
[07:37:06] <[AD]Turbo> hi there
[08:50:06] <diffuse> Is there any benefit to deduping records prior to issuing a batch insert to a collection that has a unique index with dropDups enabled?
[08:50:20] <diffuse> performance wise?
[08:57:46] <remonvv> \o
[08:58:27] <remonvv> Are you asking if manual de-dupe + batch is faster than batch with dropDups on the index?
[09:00:04] <diffuse> yes
[09:00:31] <diffuse> well, i only plan on deduping what is about to be inserted
[09:01:00] <diffuse> i am not going to dedup my list against the db
[09:01:39] <diffuse> I just noticed that inserting a dupe into mongo can be slow
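For context, this is the index option being discussed; dropDups existed on the 2.x servers of this era (it was later removed in 3.0) and silently deletes duplicate documents while the index builds. The field name is a placeholder:

    db.coll.ensureIndex({ key: 1 }, { unique: true, dropDups: true })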
[09:13:28] <silasdavis> Suppose I have some naturally hierarchical fields, like 'country', 'county', 'town'
[09:13:55] <silasdavis> and I know that I will always have all three values when performing a lookup
[09:15:32] <silasdavis> what are the relative merits of describing all three in a 'location' document: { "country": ..., "county": ..., "town":...} vs three embedded documents
[09:16:29] <silasdavis> "country": { "name":..., "counties": [...]}
[09:16:46] <Nodex> eh?
[09:16:54] <silasdavis> "county": {"name":..., "towns":[...]} etc
[09:17:24] <silasdavis> so i could have a single document with a compound index
[09:17:28] <Nodex> it really depends on your query patterns
[09:17:38] <silasdavis> or I could have a hierarchy of documents
[09:17:57] <silasdavis> so as I said i will always have three values to lookup against
[09:18:13] <silasdavis> so I can always start with country, then county, then town if i want to
[09:18:25] <silasdavis> and I won't ever just have to look up town
[09:18:34] <silasdavis> I'd guess the hierarchy would be quicker
[09:18:42] <silasdavis> I'm just wondering what the tradeoff would be
[09:19:03] <Nodex> again it depends on your query patterns
[09:19:16] <silasdavis> what more do you need to know about the query patterns?
[09:19:40] <Nodex> err how you access your data
[09:20:05] <Nodex> is EVERY query always going to have 3 fields?
[09:20:26] <silasdavis> yes
[09:20:34] <Nodex> so your index will be very very very large then
[09:21:14] <silasdavis> in which case
[09:21:22] <silasdavis> with the compound index?
[09:21:38] <Nodex> in all cases, you're indexing on three fields in a compound or a single key - it's still 3 values
[09:22:46] <silasdavis> well the compound index must be larger because it will allow me to look up with any subset of those fields
[09:22:59] <Nodex> that's not how a compound index works
[09:23:09] <Nodex> you can look up on all of them or a prefix, not any one of the 3
[09:23:18] <Nodex> a,b,c or a,b or just a
[09:24:04] <Nodex> http://docs.mongodb.org/manual/core/indexes/#compound-indexes <--- look at the example
[09:24:25] <Nodex> specifically the bit after "The index, however, would not support queries that select the following"
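A small sketch of the prefix rule from that page, using silasdavis's fields (collection name hypothetical):

    db.places.ensureIndex({ country: 1, county: 1, town: 1 })
    // served by the index, because each is a prefix of (country, county, town):
    db.places.find({ country: 'UK' })
    db.places.find({ country: 'UK', county: 'Kent' })
    db.places.find({ country: 'UK', county: 'Kent', town: 'Dover' })
    // not a prefix, so not served by this index:
    db.places.find({ town: 'Dover' })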
[09:26:20] <silasdavis> Nodex, ah so in fact it is a hierarchical index, I misread
[09:26:49] <silasdavis> I'd expect them to be very similar then
[09:27:12] <silasdavis> I don't see why this index would be excessively large
[09:27:53] <Nodex> LOL ok. I have put the geonames database into mongo and applied an index similar to what you are asking and the index is about 150gb
[09:28:14] <Nodex> now I don't have 150gb of ram so my indexes will be missing a large portion of the time
[09:28:33] <silasdavis> what fields were in your index?
[09:28:38] <Nodex> 3 fields
[09:29:07] <silasdavis> what were they?
[09:29:22] <Nodex> it doesn't really matter
[09:29:37] <Nodex> one was country, one was place, I can't remember the other
[09:29:42] <silasdavis> it might, also I'd be interested to replicate
[09:30:04] <Nodex> you want to replicate the data or shard it?
[09:30:43] <silasdavis> no I mean replicate what you're saying about a 150gb index
[09:30:56] <Nodex> so you want 2 machines with 150gb indexes?
[09:31:21] <Nodex> oh sorry, I thought you were referring to replica sets
[09:32:01] <Nodex> do what you need to do. I have a lot of very extensive experience when it comes to area data and mongodb so you can either take my advice or not :)
[09:33:31] <silasdavis> Nodex, was it compound index or a geo index?
[09:36:15] <Nodex> the size difference is not that much between the 2, I have tried both
[09:43:44] <silasdavis> I just had a look at the back-of-an-envelope formula for index size: http://stackoverflow.com/questions/8607637/are-there-any-tools-to-estimate-index-size-in-mongodb
[09:44:04] <silasdavis> the allCountries geoname data has 8434481 records
[09:44:50] <silasdavis> suppose it's UTF-8 encoded, then worst case each character takes 6 bytes to encode, and suppose our average indexed field length is 20
[09:45:09] <silasdavis> (this should be an overestimate)
[09:45:32] <silasdavis> then according to that the index size would be around (2*8434481*(18+60+5)) bytes
[09:45:41] <silasdavis> about 1.4 GB
[09:46:19] <Nodex> awesome
[09:46:32] <Nodex> guess you have it all figured out then :)
[09:46:34] <silasdavis> I'm not sure how being compound would affect this. I guess for every parent value in the index we would need another smaller index
[09:46:57] <silasdavis> Nodex, clearly I don't, I'm just trying to get to the bottom of this
[09:48:06] <silasdavis> because there seems to be a difference of two orders of magnitude between that estimate and what you have observed
[09:49:28] <Nodex> I guess I should rephrase "index" to "working set"
[09:50:24] <Nodex> 1.8M UK postcodes in a compound index = 250mb index and 650mb storage size
[10:00:22] <Nodex> you're welcome
[10:03:45] <remonvv> Is MacYET still active here?
[10:03:57] <Nodex> he's perma banned
[10:04:10] <Nodex> on like 3 different IP's
[10:04:19] <remonvv> lol
[10:04:33] <Nodex> sometimes I miss his weird sense of humour
[10:04:43] <Nodex> but most of the time, not so much
[10:04:46] <remonvv> I ran into one of Antoine's blogs and he responded with "There are specialized storages like Riak & friends for exactly this purpose. Why do you need or want to bend MongoDB that? Do you know what you are doing?"
[10:04:51] <remonvv> 10gen's Antoine, mind you
[10:05:16] <Nodex> hahahahahahah
[10:05:29] <remonvv> I still suspect he's a CouchDB committer.
[10:05:30] <Nodex> I suppose it's a fair question
[10:05:46] <remonvv> Oh it is. But the way he asks is...well...socially challenged.
[10:06:34] <Nodex> some people are just like that - they don't know that it's the wrong way to interact with people
[10:06:46] <Nodex> and also - sarcasm gets lost on the internet
[10:06:55] <Nodex> it's not UTF-8 Compatible
[10:30:13] <lpin> hello guys, i'm designing an API for a bank of items used in a language assessment
[10:31:17] <lpin> i'm tempted to use mongo as the db
[10:31:45] <lpin> is it crazy to use a single collection to store a document like this? http://hastebin.com/fikeduxavu.sm
[10:34:37] <lpin> one item has 1 to many sub items and each sub item has 1 to many answers
[10:35:16] <lpin> is it crazy to embed everything?
[10:36:20] <Nodex> not really
[10:36:39] <Nodex> a question can have multiple ways of answering it as sub docs
[10:41:58] <lpin> that would make the API straightforward
[10:42:35] <lpin> GET http://mydomain.com/items
[10:42:52] <lpin> GET http://mydomain.com/items/:id
[10:43:24] <lpin> POST http://mydomain.com/items
[10:43:26] <lpin> etc...
[10:46:29] <lpin> the downside is that when the author wants to edit the item, for example changing an answer, the whole document has to be updated
[10:58:01] <remonvv> lpin, not sure if I missed part of the conversation but my opinion is that you should definitely not go for embedding everything.
[10:58:11] <remonvv> Nor should your schema affect your API design and vice versa by the way
[11:01:41] <lpin> remonvv so you are suggesting to use a different collection for each resource? e.g. the items collection the subitems collection and the answers collection?
[11:06:09] <remonvv> No, I'm suggesting you design your API to be as easy to use as possible for the developers or clients that consume it, and make informed decisions on when and when not to embed.
[11:06:25] <remonvv> There are a few guidelines on when and when not to embed.
[11:10:37] <lpin> remonvv, i'm struggling to find the way to go, but anyway thank you :>
[11:10:55] <remonvv> Is it a generic system or do you know which collection is for what purpose?
[11:11:40] <remonvv> If it's generic you have to go for separate collections really. If you don't know what the bounds are of the documents in a sub collection you cannot embed.
[11:12:12] <remonvv> Embedded structures should be relatively small to avoid costly updates.
[11:13:27] <Nodex> lpin : to clarify I was saying that ONE document per question is fine - not ONE document period
[11:13:28] <lpin> well this isn't much of a concern; an item can have at max 10 subitems and a subitem can have at max 4 answers
[11:19:30] <remonvv> lpin, that sounds okay to embed then.
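Since the hastebin paste has expired, here is a rough sketch of the embedded shape being discussed, with hypothetical field names:

    db.items.insert({
        title: 'Reading comprehension 1',
        subitems: [
            { text: 'Question 1',
              answers: [ { text: 'A', correct: false },
                         { text: 'B', correct: true } ] }
        ]
    })

One caveat relevant to lpin's editing concern: on the servers of this era the positional $ operator only reaches one array level deep, so updating a single answer in place means rewriting at least its whole subitem.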
[11:20:40] <Nodex> ycombinator is very slow today
[11:25:20] <lpin> remonvv also when editing, the author should see the item as a whole so i don't need great granularity. Something like PUT http://mydomain.com/items/123/subitems/123/answers/1 to edit a single answer doesn't make much sense
[11:26:26] <lpin> but PUT the whole document when editing a single answer could be overkill indeed
[11:26:47] <lpin> i mean the whole item
[11:27:06] <lpin> i should hire someone for consulting :>
[11:33:45] <lpin> i have to go home now, thank you remonvv and Nodex
[11:42:50] <twrivera> anyone willing to answer some mongo topology design questions?
[12:37:01] <game_admin_231> hello :), I am trying to sync a new replica set member to my mongo rs. I checked the logs of my primary node and also of the new node. The log of the new node keeps saying: [journal] DR101 latency warning on journal file open 17507ms. Is my storage system too slow, or can I ignore output like that?
[12:44:15] <ak5> hi, I have a mongod running on a qemu vm, it can't allocate space - why is this?
[13:51:44] <Nodex> I can feel a nice blog coming on about how to implement point in time backups with MongoDB - as in only foo,bar,baz have changed since last time so we only need to back those up
[14:07:57] <marcqualie> Is there a simple way to check the health and how in sync config servers are? I can't seem to find any docs on the config servers apart from initially setting them up
[14:30:21] <joe_p> marcqualie: use config; db.runCommand({dbHash:1})
[14:36:15] <marcqualie> joe_p: what does that do? The documentation simply says "dbHash is an internal command" and nothing else
[14:41:27] <joe_p> marcqualie: run that on your config servers to see if the data is in sync
[14:42:08] <joe_p> (08:05:03 AM) marcqualie: Is there a simple way to check the health and how in sync config servers are?
[14:45:36] <marcqualie> Awesome! Thanks joe_p, I'll hook this into monitoring somehow to keep check on it
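A minimal way to wire up joe_p's suggestion: run the command against each config server in turn and compare the output (hostnames illustrative):

    // mongo cfg1.example.com:27019/config   (repeat for cfg2 and cfg3)
    db.runCommand({ dbHash: 1 })
    // the per-collection hashes (and the top-level md5) should match on all three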
[14:51:09] <jeffwhelpley> Is there anywhere I can look to see Mongo schema designs that other people have set up for standard stuff like a users collection (i.e. with username, password hash, etc.)? I am designing a new schema and I don't want to re-invent the wheel. I have obviously Googled but haven't found anything good yet.
[14:58:00] <Nodex> I suggest you do it according to your app - it will teach you a little about data structures too
[15:00:57] <jwang> jeffwhelpley: if you have a dictionary/hash representation of your user, you can start with that and adapt as necessary
[15:04:01] <toothrot> i found an error in my logs this morning on a group(): "JavaScript execution terminated".. what can cause that? It's only happened once...
[15:10:10] <jeffwhelpley> @Nodex and @jwang, totally understand and agree. Perhaps that example muddied what I am really interested in. I am new to mongodb and I am really just interested in seeing real-world examples of schema designs that other people have done for real apps used in production. I have read about the principles of NoSQL schema design, but I learn more from seeing real world examples.
[15:11:12] <kali> i'm not sure you would learn much if i was to show you my users (which i won't do anyway)
[15:11:16] <kali> real life is messy
[15:13:06] <Nodex> the good thing about mongo is you can start small and add to it
[15:13:21] <Nodex> and it's easy to do - i.e. NO alter table statements or large migrations
[15:16:36] <Nodex> http://pastebin.com/9kkbBMxU <---- there is an example of one of mine if it helps
[15:18:33] <jeffwhelpley> YES, that does help! OK, so this is exactly what I was hoping to see. For example, you have a groups array in your users collection that has the name of the group. I am assuming there is a separate groups collection, so I am wondering why you don't have the _id value of that other collection in there as well as the name.
[15:18:52] <jeffwhelpley> or is it that you use the name as the _id?
[15:19:59] <Nodex> because I don't need the ID of the group
[15:20:35] <Nodex> I will give you the best piece of advice ever for learning mongodb
[15:20:44] <Nodex> well non relational stores in general
[15:20:57] <Nodex> work out how to do it in an RDBMS and do the opposite
[15:21:06] <Nodex> (for the most part anyway)
[15:22:44] <jeffwhelpley> ha, good advice. I guess my mindset is still RDBMS in some ways
[15:22:48] <Nodex> relating data for the hell of it (i.e. groups to IDs in a join somewhere) really is pointless - how often does a group name change, that it becomes worth joining instead of statically storing it?
[15:23:11] <Nodex> same as a username or a forename/surname ... date of birth - these things don't change all that often
[15:28:07] <jeffwhelpley> That makes a lot of sense. Thanks a lot!
[15:28:24] <Nodex> no probs
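Since pastebins rot, a hedged sketch of the denormalized pattern Nodex is describing: store the group name on the user instead of joining through an id (all field names illustrative):

    db.users.insert({
        username: 'jeff',
        password_hash: '<bcrypt hash>',
        groups: [ { name: 'admins' }, { name: 'editors' } ]
    })
    // membership lookups need no join:
    db.users.find({ 'groups.name': 'admins' })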
[16:23:37] <RomainT1> Hi
[16:26:44] <RomainT1> I have troubles writing a query.
[16:26:56] <RomainT1> With such a document in my collection https://gist.github.com/Sephi-Chan/83beafac66c6bec09b88
[16:27:22] <RomainT1> I would like to create a query that matches the exact value of the player_id of each guest
[16:27:46] <RomainT1> (the goal for me is to find any document that have the exact same guests)
[16:29:10] <RomainT1> Any idea about how to achieve that?
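One common answer, sketched here since the gist is gone and the guests array is assumed to hold { player_id: ... } subdocuments: combine $all with $size so the array must contain each given player and nothing else:

    db.games.find({
        guests: {
            $all: [ { $elemMatch: { player_id: 1 } },
                    { $elemMatch: { player_id: 2 } } ],
            $size: 2
        }
    })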
[16:48:38] <praxis_> Hello... is it possible to do ordered updates? I need to $inc a field $gte 5 that has a unique index on it
[16:54:36] <praxis_> hmm, slow day...:)
[17:03:16] <Almindor> if you use 2d index, can you have only one extra value for a compound?
[17:04:05] <Almindor> we have 200 million documents but usually query on position, time (descending mostly), foreign id and type (mostly in this order)
[17:04:22] <Almindor> is it possible to have that kind of index?
[17:07:42] <Almindor> also I tried to use geoHaystack (with the time as 3rd) and I got this error on a query: "errmsg" : "exception: assertion src/mongo/db/geo/haystack.cpp:178",
[17:07:54] <Almindor> it's not possible to query a box on geoHaystack?
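For reference, both index types take extra fields, but differently; a hedged sketch with field names taken from Almindor's description:

    // a 2d field can lead a compound index (the docs of this era show one extra field):
    db.events.ensureIndex({ position: '2d', time: -1 })
    // a haystack index takes exactly one extra field and a required bucketSize:
    db.events.ensureIndex({ position: 'geoHaystack', type: 1 }, { bucketSize: 1 })
    // haystack indexes are queried via geoSearch, which takes a point plus
    // maxDistance and an equality filter on the extra field, not a $box:
    db.runCommand({ geoSearch: 'events', near: [ -73.9, 40.7 ],
                    maxDistance: 5, search: { type: 'foo' }, limit: 30 })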
[17:26:50] <RomainT1> I've posted my question on SO: http://stackoverflow.com/q/16344002/432929
[18:48:35] <prop> I'm running a mapReduce on multiple collections into a single overview collection, using the reduce action to combine values with existing keys. The problem is I want to conditionally apply different logic when reducing, depending on the source collection or whether it's an existing value.
[18:49:44] <prop> Is there any way to discern the source, or a good structure for achieving the same result? I was setting a source field on each value in the different mappers and in the finalize function, but because finalize is called multiple times, the actual condition is being obscured.
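A hedged sketch of the usual workaround for what prop describes: push the per-source logic into the map functions and keep reduce purely combining, since with { out: { reduce: ... } } the reduce function can be re-run over its own prior output (collection and field names are placeholders):

    var mapA = function() { emit(this.key, { total: this.n }); };
    var mapB = function() { emit(this.key, { total: this.n * 2 }); };  // source-specific logic lives in map
    var reduce = function(key, values) {
        var out = { total: 0 };
        values.forEach(function(v) { out.total += v.total; });  // associative, safe to re-reduce
        return out;
    };
    db.collA.mapReduce(mapA, reduce, { out: { reduce: 'overview' } });
    db.collB.mapReduce(mapB, reduce, { out: { reduce: 'overview' } });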
[18:56:48] <themoebius_> in the aggregation framework, is it possible to get the number of elements in an array?
[18:57:02] <themoebius_> like if I want to sum the number of elements in the array for each document?
[19:07:15] <leifw> themoebius_: I think you can use $unwind and then $group with $sum
[19:07:23] <leifw> themoebius_: but I'm not sure exactly, I haven't tried
[19:07:59] <themoebius_> leifw: i think you're right
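Spelled out, leifw's suggestion looks like this ('tags' is a stand-in for the array field; the 2.4 aggregation framework had no $size operator, so counting the unwound elements is the way):

    db.coll.aggregate([
        { $unwind: '$tags' },
        { $group: { _id: '$_id', count: { $sum: 1 } } }  // per-document element count
    ])
    // use _id: null in the $group to total across all documents instead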
[19:14:23] <themoebius_> leifw: how about the number of elements in a hash?
[19:14:43] <leifw> no idea sorry
[19:21:41] <whaley> how can you determine what character encoding mongodb is using?
[19:23:45] <whaley> afaict it ought to be utf-8, but I have read that some packages may not have utf-8 support and tbh I just want to be doubly sure.
[19:23:59] <ron> whaley: seriously?
[19:24:04] <whaley> ron: why?
[19:24:16] <ron> I don't know WHY you're stalking me.
[19:24:29] <whaley> ron: because you are in EVERY GODDAMNED FREENODE CHANNEL
[19:24:48] <ron> whaley: well, that's because I'm just so FUCKING AWESOME
[19:25:05] <whaley> ron: or pathetic. I haven't figured out which yet.
[19:25:13] <whaley> ron: anyway... earn your paycheck and mock me for my question please
[19:25:16] <ron> well, we both know the real answer.
[19:25:25] <whaley> ron: "both" :P
[19:25:34] <ron> I was going for pathetic, but sure.
[19:25:43] <whaley> ron: pathetically awesome?
[19:25:49] <ron> no. just pathetic.
[19:26:05] <whaley> anyway, I stumbled on the bson spec that claims String types are utf-8
[19:26:05] <ron> so, did you try... oh.. I dunno.. googling that?
[19:26:29] <whaley> ron: yes, I always google before asking
[19:26:35] <ron> are you sure?
[19:26:38] <whaley> ron: I didn't think to check bson docs as opposed to mongo docs
[19:27:28] <whaley> ron: do I need to show you browser history?
[19:27:44] <ron> whaley: I don't need to know your porn preferences.
[19:27:56] <whaley> ron: neither does my wife
[19:27:59] <whaley> ron: but that's beside the point
[19:28:03] <ron> whaley: well, if it makes you feel better, I've safely stored Russian characters in mongodb.
[19:28:37] <whaley> ron: also... I'm no idiot. I use Private Browsing for that stuff
[19:28:43] <whaley> so it wouldn't be in my history
[19:28:56] <ron> heh, that's what they want you to believe
[19:32:54] <ron> whaley: did you find the answer?
[19:33:14] <whaley> ron: " I stumbled on the bson spec that claims String types are utf-8"
[19:33:34] <ron> whaley: and you fear the spec lies?
[19:33:49] <whaley> ron: nope, I'm cool with that as an answer
[19:34:14] <ron> whaley: then why are you still here?
[19:34:21] <whaley> ron: stalking you.
[19:34:48] <ron> whaley: at least you're not stalking me on fb.
[19:34:57] <whaley> ron: real answer... I should lurk here, as I basically use mongodb for every project atm
[19:35:11] <ron> \o/
[19:35:43] <kali> somebody said porn ?
[19:36:16] <ron> I guess it figures the french guy will wake up at the mention of porn.
[19:36:29] <kali> utf8 porn ?
[19:37:35] <ron> yes. yes it is.
[19:37:52] <whaley> woops.
[19:39:44] <ron> kali: forgot to ask you - wanna hire me?
[19:41:01] <kali> ron: i don't think we're hiring
[19:41:34] <ron> nobody likes me.
[19:41:45] <whaley> ron: bullshit.
[19:41:54] <ron> it's true.
[19:42:59] <ehershey> I'd complain that this is off topic
[19:43:04] <ehershey> but it's pretty amusing
[19:43:37] <ron> afaik, the channel doesn't have a strict on-topic rule. could be wrong. not that we do it that often.
[19:43:47] <ehershey> I could still complain
[19:44:03] <ron> nobody cares.
[19:44:05] <ehershey> unless there's a no-complaining-unless-rules-are-being-broken rule
[19:44:06] <whaley> ehershey: i've thought about turning ron into the freenode ops several times
[19:44:24] <whaley> ehershey: I think if both of us complain about him, we can get him k-lined
[19:44:47] <ehershey> I would never do that
[19:44:54] <ehershey> it would make him too happy
[19:45:26] <ron> nothing makes me happy. I'm like the grumpy cat.
[19:45:30] <ron> only less cute.
[20:26:50] <JoeyJoeJo> In my documents I have one field which is an array that holds latitude and longitude. I want to find all documents where the latitude is less than 0. How do I search for that?
[20:28:22] <Gargoyle> JoeyJoeJo: { 'location.lat': { $lt: 0 } } I think, off the top of my head. not done any mongo since xmas.
[20:28:42] <Gargoyle> JoeyJoeJo: Assuming your lat is in doc.location.lat
[20:29:18] <JoeyJoeJo> thanks, I'll try that
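Gargoyle's suggestion, spelled out for the two likely shapes (both hedged, since the document layout wasn't shown):

    // if location is an embedded doc { lat: ..., lon: ... }:
    db.coll.find({ 'location.lat': { $lt: 0 } })
    // if it's a [lng, lat] array, address the element by position:
    db.coll.find({ 'location.1': { $lt: 0 } })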
[23:05:45] <coin3d> hello there. how should i model a many to many relationship within mongo, when the relation itself has an attribute?