PMXBOT Log file Viewer

#mongodb logs for Saturday the 20th of April, 2013

[00:44:55] <sinclair-linux> hey everyone
[00:45:02] <sinclair-linux> i have a quick question
[00:45:20] <sinclair-linux> where is the most appropriate place to install a mongo database on linux?
[00:45:28] <sinclair-linux> as in what directory is the most common?
[00:45:35] <sinclair-linux> or, does it not matter?
[00:47:10] <rossdm> you can put it wherever you want, default installs usually go to /data/db
[00:48:22] <sinclair-linux> rossdm: as in, the root?
[00:48:43] <sinclair-linux> rossdm: for example, i have a site running /var/www
[00:49:09] <sinclair-linux> so, i would have the /data directory next to the /var directory in the root right?
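
The data directory is just a mongod setting rather than a fixed location; a minimal sketch with example paths only (the built-in default is /data/db, while the Debian/Ubuntu packages normally pick a path under /var for you):

    # with no options, mongod expects /data/db to exist and be writable
    mongod --dbpath /data/db

    # or set it in /etc/mongodb.conf (the old ini-style config file):
    dbpath=/var/lib/mongodb
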
[14:21:32] <lacrymology> how do I do a nested-structure search? Like if my documents look like [{ foo: 'foo', bar: { x: 1, y: 2} }, {foo: 'baz', bar: { x:2, y: 2}}, ...], how do I look for the object where foo.bar.x == 1?
[14:22:31] <Baribal> coll.find({foo.bar.x: 1})
[14:22:45] <Baribal> Er, wait.
[14:22:59] <Baribal> Er, wait.
[14:23:28] <lacrymology> Baribal: sorry, it's even worse. bar is a list, and what I want is bar, not the whole document (but I can handle that afterwards, of course)
[14:23:33] <Baribal> coll.find({bar.x: 1}), as bar is not a nested doc in foo.
[14:24:47] <lacrymology> Baribal: like { models: [{ x:1, y:2},{x:2, y:2}, {x:3, y:3}]}
[14:25:23] <kali> try find({"bar.x": 1}, { "bar.$" : 1}), if you use 2.4 or later
[14:25:26] <Baribal> Ah, okay... I'm not sure...
[14:25:45] <lacrymology> I'll check those, thank you
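
A minimal sketch of the query kali suggests, assuming a collection named coll and documents shaped like lacrymology's second example:

    // dot notation reaches into the embedded array; the quotes around the dotted key are required
    // the positional projection "bar.$" returns only the first matching array element
    db.coll.find(
        { "bar.x": 1 },
        { "bar.$": 1 }
    )
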
[16:00:03] <fredix> hi
[16:00:26] <fredix> It seems that mongo::fromjson doesn't work anymore
[16:13:25] <fredix> mongo::fromjson failed with an int
[16:27:48] <fredix> ouch, I found the issue: mongo::fromjson needs a white space before the comma if the value is an int -> { "A": 1 , "B": 2 , "C": 3 }
[16:52:49] <kali> wow, that's a robust parser
[17:43:24] <fabio> hello, I'm trying a seq scan on an 11 million row table
[17:43:32] <fabio> in postgresql vs mongodb
[17:43:47] <fabio> I thought mongo would win
[17:43:53] <fabio> but it doesn't
[17:44:06] <fabio> how can I scan 11M rows faster
[17:44:09] <fabio> by sharding?
[17:44:36] <kali> fabio: mongodb is designed mostly for low latency small queries, not for high bandwidth scanning
[17:45:44] <fabio> kali, I'm trying to mapreduce an aggregation over these 11M rows; in postgresql I have a pl/pgsql function which takes 4 sec
[17:46:01] <fabio> I thought I could reduce that number with mongo
[17:46:18] <fabio> is it not a good idea?
[17:46:18] <kali> honestly, i don't think you will
[17:46:22] <kali> no.
[17:46:51] <fabio> then why are people migrating to mongodb for metrics and large amounts of data
[17:47:13] <kali> large amount of data is not a problem, scanning is
[17:47:33] <fabio> I mean it only works for low latency tiny queries?
[17:47:44] <kali> that's what mongodb is best at
[17:47:52] <fabio> with no joins and not much functionality?
[17:48:22] <kali> no joins, yeah, definitely.
[17:48:55] <fabio> why does mongodb sell the wonderful mapreduce in its docs
[17:49:12] <fabio> if it is better done with functions in a relational database
[17:49:13] <richthegeek> I'm writing a system which runs as a daemon and which should init a process when a row is inserted/updated in a collection. The collection will have at most (and usually far fewer) 1000 rows. Is it more performant to poll the collection for new rows every N seconds, or to cap the collection and use a tailable cursor?
[17:49:13] <fabio> ?
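
richthegeek's polling-vs-tailable question goes unanswered in this log; as a rough sketch of the polling side, written in mongo shell syntax purely to illustrate (the collection name is made up, and a real daemon would do this through its own driver):

    // remember the highest _id seen and ask only for newer documents on each pass
    var lastId = ObjectId("000000000000000000000000");
    while (true) {
        db.jobs.find({ _id: { $gt: lastId } }).sort({ _id: 1 }).forEach(function (doc) {
            lastId = doc._id;
            // kick off the worker process for this document here
        });
        sleep(5000);   // shell helper: poll every 5 seconds
    }
    // note: this only catches inserts; catching updates needs an indexed "modified" timestamp.
    // the alternative is a capped collection read with a tailable cursor, which blocks on the
    // server instead of re-issuing the query, at the cost of capping the collection.
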
[17:49:48] <richthegeek> fabio: if the core of your app is joining (relating) data, then a relational database is what you need
[17:49:51] <kali> fabio: i don't know
[17:50:04] <kali> fabio: but i know that for high latency processing, nothing beats hadoop
[17:50:21] <kali> fabio: and certainly not mongodb, which is optimised for exactly the opposite
[17:50:26] <richthegeek> fabio: however, for speed and lack of schema (which mostly helps speed, but has other uses too), a document store like Mongo is best
[17:51:23] <fabio> I still don't understand in which situation mongodb wins
[17:51:44] <richthegeek> the situation in which you need speed over relational capabilities
[17:51:45] <fabio> I thought it won at big data
[17:51:51] <fabio> I need speed
[17:52:09] <fabio> but not in aggregating data
[17:52:21] <fabio> it only works for selecting one document
[17:52:33] <fabio> ?
[17:52:52] <fabio> who can design a database for that case?
[17:53:30] <richthegeek> I'm not sure what you mean by "but no in aggregate data, only works in select one document", can you rephrase?
[17:53:41] <kali> fabio: high load of interactive queries
[17:53:52] <fabio> I need to aggregate data from 11M rows
[17:53:58] <kali> fabio: that is, about any read/write web site
[17:54:14] <fabio> I thought of sharding to improve performance
[17:54:20] <fabio> and do the mapreduce
[17:54:24] <fabio> faster than postgresql
[17:54:34] <fabio> but you say it is not a good idea
[17:55:03] <fabio> on one server, the seq scan is twice as slow as in postgresql
[17:57:12] <kali> fabio: anyway, you'd better consider the aggregation framework instead of map/reduce
[17:57:43] <kali> fabio: and yes, sharding should help
[17:59:18] <fabio> yea kali just read here http://stackoverflow.com/questions/12678631/map-reduce-performance-in-mongodb-2-2
[17:59:59] <fabio> but I think it's a waste of time because a seq scan is twice as slow as in postgresql
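
A minimal sketch of the aggregation framework kali is pointing at, with invented collection and field names rather than fabio's actual schema:

    db.rows.aggregate([
        { $match: { year: 2013 } },                // filter first so an index can be used
        { $group: { _id: "$category",              // group key
                    total: { $sum: "$value" },     // per-group sum
                    count: { $sum: 1 } } },        // per-group row count
        { $sort: { total: -1 } }
    ])
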
[18:02:57] <kali> you plan to run these scans / aggregations interactively?
[18:06:28] <fabio> yes
[18:06:39] <fabio> a user can now do a select from a function
[18:06:48] <fabio> wait 4 seconds and have the results
[18:07:17] <fabio> I want to use mongodb and its aggregation framework, thinking it would improve that time
[18:07:52] <kali> mmmm
[18:08:38] <kali> if you shard a lot, maybe you'll get something better... it will cost you in hardware
[18:09:00] <kali> can't you pre-aggregate some of the data beforehand ?
[18:09:45] <fabio> nope
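
For completeness, kali's pre-aggregation idea usually means maintaining running totals at write time with an upsert and $inc, so the interactive query reads a few small counter documents instead of scanning 11M rows (names here are invented):

    db.daily_totals.update(
        { _id: { day: "2013-04-20", category: "foo" } },
        { $inc: { count: 1, total: 42 } },
        { upsert: true }
    )
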
[18:09:56] <fabio> what is sharding a lot
[18:10:03] <fabio> more than 4 servers?
[18:11:00] <kali> well, in the theoretical case, cutting the dataset in two parts might cut the time in half
[18:11:41] <kali> so you would need two shards to match psql perf, four to divide the time in two, etc
[18:14:57] <fabio> then that would be nice
[18:15:23] <fabio> that's what I thought, by sharding I would reduce the time
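
Putting numbers on kali's estimate, and assuming the scan parallelises linearly across shards with no routing or merge overhead:

    single mongod scan  ≈ 2 × 4 s = 8 s   (twice as slow as the 4 s pl/pgsql function)
    2 shards            ≈ 8 s / 2 = 4 s   (parity with postgresql)
    4 shards            ≈ 8 s / 4 = 2 s
    8 shards            ≈ 8 s / 8 = 1 s
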
[21:16:28] <preinhei_> I'm having some trouble finding my data. Which is a pity.
[21:16:46] <preinhei_> I've got four mongo servers participating in a replica set
[21:17:14] <preinhei_> I'm doing work near one of the slaves, and not all the data I'm writing is available later
[21:17:35] <preinhei_> http://pastie.org/private/0famdfkurqfxpfqsy8kza (using the PHP driver)
[21:19:07] <preinhei_> I guess a key note would be that I'm writing three things under results.Chicago: http (the one documented there), trace (which shows up in results) and dig
[21:21:18] <preinhei_> I've got a copy of the application running on the same server as the primary, no problems there
[21:22:04] <preinhei_> setting write concern (either to 2, or majority) seems to have made things worse, not better.
[21:23:07] <kali> preinhei_: you're aware you need your php client to be able to talk to the primary?
[21:23:21] <preinhei_> yes
[21:23:44] <kali> preinhei_: do you get some kind of error with some write concern?
[21:23:56] <preinhei_> the other data elements (like dig and trace) are being written by the same script, on the same server
[21:24:45] <preinhei_> I've not found any error, but when I'm not modifying write concern I'd say 2/3rds+ of the data is available
[21:24:54] <preinhei_> now that I've set it, none of it shows up
[21:27:22] <preinhei_> wrong button, sorry
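
A minimal sketch of the two knobs under discussion, in mongo shell syntax of that era (collection and field names are placeholders, not the ones in preinhei_'s paste):

    // write on the primary, then block until a majority of the replica set has the write
    db.results.insert({ city: "Chicago", http: { status: 200 } })
    db.runCommand({ getLastError: 1, w: "majority", wtimeout: 5000 })

    // reading near a secondary has to be allowed explicitly, and can still lag the primary
    // if the write was not confirmed with a write concern like the one above
    rs.slaveOk()
    db.results.find({ city: "Chicago" })
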
[22:09:18] <Almindor> anyone here using mongodb on ubuntu via upstart along with some dependent service (e.g. node or some other server using mongo)?
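
Almindor's question also goes unanswered here; the usual upstart pattern is to key the dependent job off the mongodb job's events. A sketch, with a made-up job name and paths:

    # /etc/init/myapp.conf
    description "node app that needs mongod running first"
    start on started mongodb
    stop on stopping mongodb
    respawn
    exec /usr/bin/node /opt/myapp/server.js
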