PMXBOT Log file Viewer


#mongodb logs for Saturday the 16th of June, 2012

[00:06:36] <mitsuhiko> mkmkmk: hah
[00:06:45] <mitsuhiko> scar
[00:06:47] <mitsuhiko> *scary
[00:12:33] <mkmkmk> scary indeed
[00:13:07] <mkmkmk> i have another box up that had mongos running before the 3rd config died & was brought back… it's operating fine
[00:15:29] <mitsuhiko> mkmkmk: is this compiled by hand?
[00:17:41] <mitsuhiko> mkmkmk: do all of the mongo config thingies run the same binary?
[00:23:15] <mkmkmk> it's the linux x64 binaries
[00:23:17] <mkmkmk> from the site
[00:23:44] <mkmkmk> config 1 and 2 run 2.0.2 and 3 runs 2.0.6
[00:23:49] <mkmkmk> i didnt update 1 and 2 yet
[00:24:06] <mkmkmk> kinda scared to, to be honest
[00:24:25] <mkmkmk> if mongos dies on another box and wont come back, i'm going to be dead in the water
[01:54:13] <Kryten001> Hi, here http://www.mongodb.org/display/DOCS/Object+IDs they say that an _id must be unique for a given db cluster, does that mean that two objects in the same db but different collections cannot have the same _id ?
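
For reference, the uniqueness of _id is enforced per collection (each collection gets a unique index on _id), so two documents in different collections of the same db can share an _id. A quick mongo-shell sketch with made-up collection names:

    // each collection has its own unique index on _id, so this is fine
    db.users.insert({ _id: 1, name: "alice" })
    db.orders.insert({ _id: 1, total: 42 })

    // this one fails with a duplicate key error, because it targets
    // a collection that already holds a document with _id 1
    db.users.insert({ _id: 1, name: "bob" })
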
[04:34:07] <dstorrs> I have a collection that stores statistics. When I add new stats, I would like them to include incremental data (current stat - previous value). Is a mapReduce the best option here, do you think?
[04:34:24] <dstorrs> There's about 10M records per harvest
[04:34:54] <dstorrs> Doing it one by one is certainly out, but is there an alternative to M/R that I haven't considered?
[05:46:25] <oktapodi> hey
[05:46:44] <oktapodi> im designing a database in mongo for the first time
[05:47:26] <oktapodi> i have a model called "Band" as in music band.. it has "Albums" and "Tours"
[05:48:01] <oktapodi> how should i create the relationship? should i embed both in "Band" or should i create a different relationship for each?
[05:48:09] <oktapodi> like we do in RDBMS
[05:48:24] <oktapodi> im really confused could someone help me out?
[06:11:30] <dstorrs> oktapodi: it depends on how many of each you'll have
[06:11:49] <oktapodi> lets say on avg
[06:11:58] <oktapodi> each band has 5 albums
[06:12:00] <dstorrs> you've got two options -- put each entity in a different collection and do the "joins" in app code
[06:12:04] <dstorrs> or embed them.
[06:12:10] <oktapodi> and the tours would be a lot more
[06:12:19] <dstorrs> what's the max number of albums you might expect?
[06:12:24] <oktapodi> 5
[06:12:35] <dstorrs> you said that was average. what's the max?
[06:12:43] <dstorrs> say 100 for the degenerate case?
[06:12:44] <oktapodi> ok around 15
[06:12:51] <dstorrs> ok, still tiny. good
[06:13:22] <dstorrs> so embed the albums
[06:13:47] <oktapodi> ok awesome... and for the tours i'd have to do joins right?
[06:14:06] <oktapodi> because tours would be a lot
[06:14:28] <dstorrs> define "a lot" ... ?
[06:14:40] <oktapodi> 500
[06:15:01] <dstorrs> how big is each 'Tour' object ?
[06:15:15] <oktapodi> it has a start date and an end date and a location
[06:15:15] <dstorrs> you might still be able to get away with embedding
[06:15:20] <oktapodi> thats all
[06:15:23] <dstorrs> that's it?
[06:15:26] <oktapodi> ya
[06:15:30] <dstorrs> oh hell yes, embed that
[06:15:37] <oktapodi> wow brilliant
[06:15:50] <oktapodi> i didn't know we could do that in mongo
[06:16:30] <oktapodi> wow thanks a lot dstorrs
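
A rough mongo-shell sketch of the embedded layout dstorrs is suggesting (collection and field names here are illustrative, not from the conversation):

    // one document per band, with albums and tours embedded as arrays
    db.bands.insert({
        name: "Some Band",
        albums: [
            { title: "First Album", year: 2005 },
            { title: "Second Album", year: 2008 }
        ],
        tours: [
            { location: "Berlin", start: new Date("2012-05-01"), end: new Date("2012-05-03") }
        ]
    })

    // everything comes back in one read, no joins in app code
    db.bands.findOne({ name: "Some Band" })

    // append a tour later without rewriting the whole document
    db.bands.update(
        { name: "Some Band" },
        { $push: { tours: { location: "Paris", start: new Date("2012-06-10"), end: new Date("2012-06-12") } } }
    )
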
[06:17:20] <heoa> What was the MongoDB community blog where blog posts were fetched from the git repo?
[06:50:54] <dstorrs> oktapodi: you're welcome. (pardon long ping, I'm still working and a bit distracted)
[09:45:59] <mukeshagarwal> hey
[09:46:21] <mukeshagarwal> struggling to connect my mongodb driver to my php
[09:46:28] <mukeshagarwal> i'm using mac osx lion
[09:46:36] <mukeshagarwal> can anyone help?
[11:07:07] <Kage`> Anyone know of any PHP+MongoDB forum systems?
[12:57:02] <Kryten001> Hi, How do I interrupt a running mapReduce job ?
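
One way to do this from the mongo shell is to find the operation in db.currentOp() and kill it by opid (the 12345 below is a placeholder; the exact fields shown for a map/reduce job vary by version):

    // list in-progress operations and note the opid of the map/reduce job
    db.currentOp().inprog.forEach(function (op) {
        printjson({ opid: op.opid, op: op.op, ns: op.ns, msg: op.msg })
    })

    // then kill it
    db.killOp(12345)
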
[13:34:11] <mitsuhiko> does anyone know if i can make a mongos just die on error instead of trying to recover itself?
[13:34:33] <mitsuhiko> if a mongos comes up before a replica set then the whole thing just zombies out and never recovers by itself
[14:04:41] <Defusal> hi everyone
[14:04:42] <Defusal> http://www.zopyx.de/blog/goodbye-mongodb
[14:04:50] <Defusal> any comments?
[14:07:04] <mitsuhiko> Defusal: what do you expect?
[14:07:08] <mitsuhiko> disagreement?
[14:07:48] <Defusal> im not here to expect, but rather to observe
[14:08:02] <Defusal> i'd like to see what the mongo community thinks of the points made there
[14:08:20] <Defusal> this is the first time i have read that, but i can see where he is coming from
[14:08:23] <mitsuhiko> Defusal: none of the things on there are wrong
[14:08:33] <Defusal> exactly what i thought
[14:08:38] <mitsuhiko> mongodb is currently still an early adopter thing
[14:08:41] <mitsuhiko> take it for what it is
[14:09:02] <mitsuhiko> it's neither reliable nor easy to use nor well designed. but compared to many other nonsql solutions it has some merits
[14:09:03] <Defusal> which brings me to the point where i need to start deciding whether continuing to invest in mongo for projects is worth it in the long run
[14:09:09] <Defusal> or if i should start comparing alternatives
[14:09:15] <mitsuhiko> Defusal: postgres
[14:09:24] <mitsuhiko> unless you have a really good reason to not use it, use it
[14:09:24] <Defusal> NOSQL alternatives
[14:09:27] <mitsuhiko> Defusal: why
[14:09:37] <Defusal> and schemaless
[14:09:44] <mitsuhiko> Defusal: trust me, you don't want schemaless
[14:09:55] <mitsuhiko> the first thing we did was putting a schema on top of mongodb
[14:09:55] <Defusal> because those are design decisions that have been made for these projects
[14:10:04] <mitsuhiko> schemaless is a myth anyways because of indexes and shard keys
[14:10:15] <mitsuhiko> Defusal: why do you want schemaless?
[14:10:24] <Defusal> obviously you define a schema, but not database-side
[14:10:42] <mitsuhiko> what's wrong in having the schema in the database?
[14:11:13] <Defusal> i want the advantages of a non-linear schema; although embedded documents aren't always great, they work well in certain cases
[14:11:34] <mitsuhiko> schemaless is dangerous
[14:11:39] <Defusal> and having to constantly update the database schema like i do with projects using SQL databases, is just a pain in the ass
[14:11:51] <Defusal> especially during a prototype
[14:12:00] <Defusal> lots of things are "dangerous"
[14:12:03] <mitsuhiko> i think you're using the wrong tools then
[14:12:30] <Defusal> it should be up to the developer to take care, not the database to force unnecessary action
[14:12:34] <mitsuhiko> Defusal: nosql scales, but convenience is something else
[14:12:43] <mitsuhiko> Defusal: schemas are not unnecessary
[14:12:57] <mitsuhiko> Defusal: 80% of our data is keys ...
[14:12:58] <Defusal> anyway, i did not come here for this debate
[14:13:05] <mitsuhiko> i wish mongodb had schemas
[14:13:07] <Defusal> i just asked about the points made in that post
[14:13:20] <Defusal> i agree with that disadvantage at least
[14:13:43] <Defusal> in the end, it depends on what you are storing
[14:14:00] <mitsuhiko> Defusal: if you want a nosql solution mongodb is not the worst in the world
[14:14:02] <Defusal> i have been using mongodb for years
[14:14:12] <mitsuhiko> years :)
[14:15:16] <Defusal> while there are some downsides to the query interface, for the most part, the schemaless nature does streamline development, especially for prototypes
[14:15:29] <Defusal> and all my projects at this point in time start as prototypes
[14:15:43] <Defusal> so any streamlining is more than welcome
[14:16:19] <Defusal> i cringe when i have to make a single change to a project using a SQL database just because of the effort it takes to update the schema
[14:16:33] <Defusal> whereas with a mongo project the change is made in seconds
[14:17:01] <mitsuhiko> Defusal: the problem with mongodb is not mongodb itself as much as the fact that everybody thinks he needs to use it
[14:17:16] <Defusal> i see
[14:17:18] <mitsuhiko> mongodb just does not work unless you have replica sets and config servers
[14:17:26] <mitsuhiko> most people that complained are not running it in that setup
[14:17:34] <Defusal> it doesn't?
[14:17:48] <mitsuhiko> unless you want to lose data that is
[14:17:51] <Defusal> i have never used it personally on a large enough scale for multiple servers
[14:18:10] <mitsuhiko> most of the problems that people complain about are happening with zero or one config server
[14:18:17] <Defusal> although a platform i designed when i was contracted out to do so has grown to such a point since i left
[14:19:14] <Defusal> well, is there an article somewhere pointing out the right setup if you want to avoid data loss?
[14:19:21] <Defusal> for small scale setups that is
[14:19:46] <mitsuhiko> don't use mongodb for small scale setups :P
[14:19:49] <Defusal> i haven't noticed any issues with more recent versions personally
[14:19:56] <mitsuhiko> unless you never ever want it to scale up
[14:20:09] <Defusal> err?
[14:20:20] <mitsuhiko> even the official docs pretty much say that going from non-sharding to sharding is painful
[14:20:43] <Defusal> very small setups (community projects) work fine, and would not need to be scaled up much
[14:21:01] <Defusal> some of my other projects like my current one will definitely be scaled up in the future
[14:21:22] <Defusal> in a major way eventually, but i will probably run separate setups in different countries
[14:21:41] <Defusal> my point is
[14:22:12] <Defusal> if a single server works fine as is, is the threat of data loss really large enough for it to be worth a sharded setup
[14:22:23] <Defusal> even before such a large scale is needed
[14:22:25] <mitsuhiko> shards don't help against data loss
[14:22:27] <mitsuhiko> replica sets do
[14:22:36] <Defusal> well replica sets then
[14:22:45] <Defusal> you obviously need a replica on a separate machine
[14:23:06] <Defusal> point is, is it really necessary?
[14:24:23] <Defusal> a replica server would obviously be better protection than backups every few hours, sure, but how often is there really gonna be a data loss issue that will require recovery?
[14:24:29] <mitsuhiko> Defusal: if you use mongodb you need replica sets
[14:24:37] <mitsuhiko> it's one of the constraints of the design. that's how you run it
[14:25:04] <Defusal> then you could run a replica set on the same machine
[14:25:24] <Defusal> that would be protection against internal mongo corruption
[14:25:25] <kali> not if you're ready to lose all writes between two backups
[14:26:05] <Defusal> kali, you mean in the case of hardware failure?
[14:26:08] <mitsuhiko> Defusal: mongodb is just fine for what it is. but you have to take it with all its warts and bugs :)
[14:26:20] <mitsuhiko> i think the argument for mongodb is that 10gen is very, very active and cares about improving their product
[14:26:36] <Defusal> mitsuhiko, well then again, please point me to an article that provides details on this
[14:26:44] <mitsuhiko> Defusal: what is "this"?
[14:26:52] <Defusal> because i have never heard of mongodb requiring a replica server before
[14:26:58] <Defusal> and i've been using it for years
[14:27:16] <Defusal> a detailed article on what is required for a stable, reliable setup
[14:27:16] <mitsuhiko> Defusal: why do you use mongodb without reading up on the design?
[14:27:23] <mitsuhiko> especially if you use it for years
[14:27:32] <Defusal> i know all about the design
[14:27:39] <kali> Defusal: you're right, that was actually the point of having journaling
[14:27:39] <Defusal> i know about sharding, replica servers, etc
[14:27:41] <mitsuhiko> then why do you ask why you need a replica set ...
[14:27:56] <Defusal> but as i said, you are the first one to ever tell me a replica set is *required*
[14:28:00] <mitsuhiko> until recently you absolutely needed replica sets. now it's more relaxed
[14:28:18] <Defusal> i have never seen anyone, on irc, or an online article, say that is a *requirement*
[14:28:29] <kali> mitsuhiko: it's never been a requirement, come on :)
[14:28:51] <kali> mitsuhiko: it all depends on how precious is the data
[14:29:12] <mitsuhiko> kali: no, but it was a very good idea ;)
[14:29:12] <Defusal> kali, that is what i thought
[14:29:12] <Defusal> i had lots of data loss issues with early mongodb versions
[14:29:12] <Defusal> haven't noticed anything since about v1
[14:29:38] <Defusal> but now that mitsuhiko says this, i would like to know what the real world chance of data loss without a replica server is
[14:29:45] <mitsuhiko> i'm just saying: mongodb is not the problem, wrong application is.
[14:29:59] <Defusal> you've said a bit more than that ;)
[14:30:12] <mitsuhiko> those were just logical conclusions
[14:30:52] <Defusal> kali, so according to you at least, you're pretty safe without replicas
[14:31:06] <kali> Defusal: depends on your data :)
[14:31:12] <kali> Defusal: i would not do it with mine :)
[14:31:24] <kali> Defusal: except for debug log or that kind of crap, maybe
[14:31:24] <Defusal> heh ok
[14:32:04] <kali> Defusal: i mean, you're always at the mercy of a catastrophic hardware failure, and these are not that rare
[14:32:25] <kali> Defusal: so if you are willing to risk having to start up from the last backup, yeah, why not
[14:32:46] <Defusal> ok
[14:33:14] <Defusal> my current main server has 2 x 1TB hdds in raid in case one fails
[14:33:44] <kali> and if the controller fails and starts making holes in both disks ? :)
[14:33:47] <Defusal> but it is quite possible to lose everything in a raid setup anyway
[14:33:52] <Defusal> sure :)
[14:34:01] <Defusal> better than nothing still
[14:34:15] <kali> or your OS goes mad, or an admin makes a mistake...
[14:34:16] <kali> yes
[14:34:23] <Defusal> yeah
[14:34:30] <Defusal> s/an admin/Defusal
[14:34:31] <kali> you protect yourself against disk failure. and that's all.
[14:35:00] <Defusal> why is adding a replica set later so much effort?
[14:35:04] <mitsuhiko> i just wish the mongos would be more reliable :(
[14:35:09] <kali> Defusal: it's easy
[14:35:17] <Defusal> mitsuhiko said it was terrible to add later
[14:35:21] <mitsuhiko> Defusal: i said sharding was
[14:35:23] <kali> sharding is harder
[14:35:25] <Defusal> ah right
[14:35:29] <kali> replication is really easy
[14:35:30] <Defusal> why is that hard?
[14:35:54] <mitsuhiko> Defusal: because you need to introduce a mongo controller and mongo routers, and the balancer will spend a long time moving your data over :)
[14:36:06] <Defusal> right ok
[14:36:11] <kali> and there are some rare semantic alterations
[14:36:20] <Defusal> well either way, i'd rather do that when i scale up
[14:36:31] <kali> group, for instance, does not work in shards
[14:36:35] <Defusal> there's going to be tons of time-consuming changes needed along the way
[14:36:49] <Defusal> ah that does suck
[14:37:31] <Defusal> well, i guess it may very well be easier to create more separation when scaling up
[14:37:45] <Defusal> rather than sharding
[14:38:08] <Defusal> but at the same time, adding replica sets at that point will be a good idewa
[14:38:11] <Defusal> idea*
[14:38:20] <Defusal> ok thanks for the advice
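
For what it's worth, turning a standalone mongod into a replica set later really is the easy part; a hedged sketch, with made-up host names and set name:

    // restart each mongod with a common set name, e.g.
    //   mongod --replSet myset --dbpath /data/db

    // then, from a shell connected to the first member:
    rs.initiate()
    rs.add("db2.example.com:27017")
    rs.add("db3.example.com:27017")
    rs.status()   // check that the members come up

    // sharding is the heavier step mitsuhiko describes: it needs config
    // servers (mongod --configsvr) and mongos routers, and the balancer
    // then migrates existing data, roughly:
    //   mongos --configdb cfg1.example.com,cfg2.example.com,cfg3.example.com
    // followed by, from a shell connected to the mongos:
    //   db.adminCommand({ addShard: "myset/db1.example.com:27017" })
    //   db.adminCommand({ enableSharding: "mydb" })
    //   db.adminCommand({ shardCollection: "mydb.tours", key: { band_id: 1 } })
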
[14:38:53] <mitsuhiko> anyone any ideas how to make a mongos die instead of retrying to find the config server?
[14:39:10] <mitsuhiko> if a replica set is entirely down and the mongos is up it will never recover from it :(
[16:03:06] <mitsuhiko> how do people manage their mongos?
[16:03:15] <mitsuhiko> just hope that a replica set is never down when a mongos comes up?
[17:36:17] <multi_io> can I load/run JS files in the mongo shell?
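
Yes, both ways work: a script file can be passed to the mongo binary on the command line, or loaded from inside a running shell with load() (paths and db name here are made up):

    // from the command line: run the script against a db and exit
    //   mongo localhost/mydb myscript.js

    // from inside the shell: evaluate a JS file in the current session
    load("/path/to/myscript.js")
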
[18:20:49] <AlecTaylor> hi
[18:21:08] <AlecTaylor> Is Varnish an alternative to MongoDB as a caching server?
[18:24:13] <skot> not really related, as varnish is (mostly) an http accelerator and mongodb is a document database.
[18:24:51] <AlecTaylor> But does Varnish use a db? - I'm sure it would... can it be modified to use Mongo?
[18:25:29] <skot> It uses in-memory/persistent data structures but also uses memory-mapped files for those
[18:25:49] <skot> I don't know of any mongodb storage option
[18:26:21] <AlecTaylor> Hmm, kk
[18:26:22] <AlecTaylor> thx
[18:26:35] <skot> or any db for that matter: http://en.wikipedia.org/wiki/Varnish_(software)
[18:30:30] <kali> nope, varnish accepts modules for request processing, but it has only two builtin storages: one mmap based, the other malloc based
[18:30:59] <kali> and i don't think the storage is modularized
[18:33:00] <skot> cool, thanks for the info, do you happen to know how many use the malloc based version; I thought the mmap one was the default.
[18:33:24] <kali> mmap is the default
[18:33:39] <kali> i think most people use it
[18:33:44] <kali> i do :)
[18:33:59] <skot> seems like the reasonable default as long as you have a dedicated machine
[18:34:24] <kali> yes
[18:35:07] <skot> with what I've seen with a shared server running mongodb, and the mmap OS behavior, I can imagine it would not go well using that model with varnish
[18:36:07] <kali> well, it makes sense to have something like a haproxy or an nginx, as they are not memory greedy
[18:36:21] <kali> but you don't want to share a VM between varnish and mongodb :)
[18:53:52] <jpnance> this is a really pedestrian question
[18:54:28] <jpnance> let's say i have a collection of documents such as: { "field1": 10, "field2": 5 }
[18:54:45] <jpnance> what's the typical way to query that collection for all documents which have field1 > field2?
[18:54:53] <wereHamster> $where
[18:54:59] <Derick> yes, or precalculate
[18:55:18] <wereHamster> which is slow, because it can't use an index. If you want to use an index, make a third field, field1minusfield2, index that, and then run a query on that
[18:55:38] <jpnance> wow, i somehow have completely missed that $where is a thing
[18:55:54] <jpnance> yeah, both of those make sense
[18:56:00] <jpnance> thanks
[19:01:04] <jpnance> what if i'm testing for equality of those fields?
[19:01:21] <jpnance> {$where: "this.field1 = this.field2"} doesn't seem to work
[19:01:52] <jpnance> whoops, i guess == is what i'm looking for
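
Putting the two suggestions together in the shell (the collection name is a placeholder; the helper field field1minusfield2 is the one wereHamster suggested):

    // $where evaluates JavaScript per document, so it cannot use an index
    db.things.find({ $where: "this.field1 > this.field2" })
    db.things.find({ $where: "this.field1 == this.field2" })

    // indexed alternative: precalculate the difference and index it
    db.things.find().forEach(function (doc) {
        db.things.update({ _id: doc._id },
                         { $set: { field1minusfield2: doc.field1 - doc.field2 } })
    })
    db.things.ensureIndex({ field1minusfield2: 1 })

    // the comparisons then become ordinary indexed queries
    db.things.find({ field1minusfield2: { $gt: 0 } })   // field1 > field2
    db.things.find({ field1minusfield2: 0 })            // field1 == field2
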
[20:28:41] <multi_io> one can't have multiple levels of secondaries, right?
[20:30:07] <wereHamster> can you explain?
[20:32:58] <multi_io> like, only use some slaves for query X, and some other slaves for query Y
[20:33:16] <wereHamster> slave as in a replica set slave?
[20:33:26] <multi_io> yes
[20:33:31] <multi_io> not sharding
[20:33:34] <multi_io> *no
[20:33:37] <wereHamster> no, mongodb doesn't work that way
[20:40:54] <kaikaikai> hi
[20:43:45] <kaikaikai> i'm running into this problem using php drivers/library: http://jsfiddle.net/haSYw/2/
[20:43:59] <kaikaikai> the error is "MongoCollection::update() expects at most 3 parameters, 4 given"
[20:44:25] <kaikaikai> but to me, the update command i'm running seems well formed according to docs http://www.mongodb.org/display/DOCS/Updating
[20:45:12] <kaikaikai> wonder if anyone could give a second look?
[20:46:04] <kaikaikai> i mean, docs specify four arguments in the introduction, so how could it expect at most three?
[20:49:53] <skot> the last two arguments should be in the same array
[21:14:01] <toothr> kaikaikai, you're looking at the JS docs, instead of the PHP driver docs
[21:14:18] <toothr> kaikaikai, http://us.php.net/manual/en/mongocollection.update.php
[21:14:52] <kaikaikai> ahhh, thanks skot and toothr, thought the mongo docs were a general outline for all drivers
[21:15:01] <kaikaikai> certainly noted
[21:16:41] <toothr> iirc there are actually links to the related driver docs for the popular languages on each doc page