#mongodb logs for Thursday the 5th of July, 2012

[07:25:32] <nrw> is this the place to ask a question about node-mongo-native?
[07:30:39] <[AD]Turbo> hello
[08:51:43] <coalado> Hi there.
[08:52:17] <coalado> I wonder if it is possible to have a kind of "view" in MongoDB, for example a self-updating collection based on a map/reduce command
[08:56:38] <ron> afaik, no.
[08:56:49] <ron> plus, map/reduce isn't really a 'light' operation.
[08:57:46] <coalado> right...
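A sketch of the closest 2012-era approximation to such a "view": have map/reduce write its output to a named collection and re-run it yourself on a schedule; the collection will not update itself. Collection and field names here are illustrative only:

    // Materialize map/reduce output into "category_counts"; it is only as
    // fresh as the last run, so a cron job (not MongoDB) does the "self-updating".
    db.source.mapReduce(
        function () { emit(this.category, 1); },
        function (key, values) { return Array.sum(values); },
        { out: { merge: "category_counts" } }
    )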
[09:06:51] <superMustafa67_> hello all
[09:07:14] <superMustafa67_> I am here to ask about the best C client library for mongodb
[09:07:20] <superMustafa67_> fast
[09:07:31] <superMustafa67_> and maybe one that supports compression on the fly
[09:09:50] <superMustafa67_> Does batch insertion support compression?
[09:10:15] <superMustafa67_> because I have terabytes of data to send each day
[09:11:48] <NodeX> there is no compression at present
[09:11:51] <NodeX> it's in the roadmap
[09:12:49] <superMustafa67_> NodeX: thanks for your answer
[09:13:13] <superMustafa67_> Another question: I have 2 kinds of data source
[09:13:21] <superMustafa67_> One is tickets from a logger
[09:13:26] <coalado> Does anybody use MongoVue? There should be a Save/Open Option to save and open map/reduce queries. But the button is missing in my version somehow.
[09:13:45] <superMustafa67_> And the other is tunnels from another machine
[09:14:08] <superMustafa67_> I need to make a resolution between the ticket-specific entries and the tunnel-specific entries
[09:14:27] <NodeX> resolution?
[09:14:53] <superMustafa67_> yes, for example: one ticket is: <ip> <request> ...
[09:15:02] <NodeX> https://jira.mongodb.org/browse/SERVER-164?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel <--- you may want to watch that for compression
[09:15:24] <superMustafa67_> And one tunnel instance is: <number><ip> ..
[09:15:40] <superMustafa67_> and at insertion, I want to find the tunnel reference for the ticket
[09:15:45] <superMustafa67_> and merge them into one entry
[09:16:03] <NodeX> the best thing to do in that case is an upsert on a common field
[09:16:25] <NodeX> the IP seems to be common to both
[09:17:23] <superMustafa67_> NodeX: thanks for the link and your answer
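A hypothetical shell sketch of NodeX's upsert suggestion, with made-up field names (ip, request, tunnelNumber): both sources upsert on the shared IP, so whichever record arrives first creates the document and the other merges into it.

    // Ticket source:
    db.entries.update({ ip: "10.0.0.1" },
                      { $set: { request: "GET /index" } },
                      true)   // upsert = true
    // Tunnel source:
    db.entries.update({ ip: "10.0.0.1" },
                      { $set: { tunnelNumber: 42 } },
                      true)   // upsert merges into the same document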
[11:04:05] <Galactica> heya, how can i use the $set and $push modifiers in one update query?
[11:07:46] <Galactica> -_-
[11:08:12] <Galactica> can i do it or not ?
[11:39:13] <mids> Galactica: db.x.update(query, {$set: {x: 1}})
[11:43:04] <Galactica> mids, i want db.x.update(query, {$set: {x: 1}, $push: {something: {}}})
[11:43:22] <mids> sure that is fine too
[11:44:31] <Galactica> it did not work for me T_T
[11:47:02] <mids> can you pastebin some example code?
[11:51:48] <Galactica> http://pastie.org/private/qvfwk0pkkm75l8dflzbva
[11:56:46] <mids> Galactica: http://pastie.org/private/ulr07kiyrbofos1zhjdsw
[11:56:48] <mids> works for me
[11:57:40] <mids> would that be the result you expect as well?
[11:58:40] <Galactica> yes
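Since the pastie links above are private, here is a minimal sketch of the combined update mids confirmed; the collection and field names are assumptions:

    db.x.update({ _id: 1 },
                { $set: { x: 1 }, $push: { something: { a: 2 } } })
    // Both modifiers are applied in one atomic update of the matched document.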
[11:59:04] <superMustafa67_> are there people who have experimented with mongodb with billions of entries in a collection + mapreduce operations ?
[11:59:09] <Galactica> so strange, maybe it was the case in the old driver
[12:00:44] <superMustafa67_> for example I have a collection of 8 billion entries
[12:01:24] <superMustafa67_> and I can't imagine mongodb doing the job of sorting and merging these entries
[12:01:59] <mids> I wouldn't use mongodb mapreduce for seriously sized datasets
[12:02:47] <superMustafa67_> mids: do you have an alternative solution ?
[12:03:13] <NodeX> hadoop
[12:03:35] <NodeX> superMustafa67_ : is your data sharded?
[12:03:42] <mids> instead look at the mongodb-hadoop adaptor; or the upcoming aggregation framework for mongo
[12:03:55] <superMustafa67_> NodeX: could be sharded yes, I have several machines
[12:04:47] <NodeX> you -could- achieve it with native mongo map/reduce on sharded data but it's not advisable, as mids says
[12:05:55] <superMustafa67_> NodeX: thanks again for your advice
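A hedged sketch of the aggregation framework mids mentions (then upcoming, shipped in MongoDB 2.2); the collection and field names are hypothetical:

    // Counts entries per IP in native code instead of JS map/reduce;
    // on a sharded cluster the $match/$group stages run on the shards.
    db.entries.aggregate(
        { $match: { type: "ticket" } },
        { $group: { _id: "$ip", count: { $sum: 1 } } },
        { $sort: { count: -1 } }
    )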
[12:52:10] <PDani> hi
[12:56:17] <PDani> i would like to write to mongodb asynchronously, and for every write operation i'd like to send a getlasterror asynchronously, read the results if any are received, pair the results with my requests by request id, and asynchronously decide which writes succeeded and which didn't. is it possible?
[12:57:09] <algernon> if you use a thread for each write, then, as far as I remember, yes.
[12:57:40] <PDani> is it possible without threading using select()?
[12:58:07] <algernon> if you make sure not to send another write before you have the result of the previous getlasterror, then yes
[12:58:30] <algernon> ie, you'll need a queue
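The queue is needed because getLastError reports on the previous operation sent over the same connection, so each write and its check must be serialized per connection. A shell illustration (collection name assumed):

    db.events.insert({ x: 1 })
    db.runCommand({ getlasterror: 1 })   // refers to the insert above, on this connection only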
[12:59:55] <PDani> that's a problem, because right now i have two bottlenecks: context switches (threading is the enemy), and network roundtrip time (i can't wait for the previous write's result)
[13:01:29] <algernon> well, you could use a thread pool of writers
[13:01:48] <algernon> where each writer would do one write/getlasterror at a time, but you could do as many concurrent writes as you have threads.
[13:02:31] <algernon> better than a thread per write, and perhaps allows a bit more throughput than a single thread that always has to wait
[13:02:55] <PDani> ok, but i still have many context switches in the client, that's why i should avoid threading
[13:04:03] <algernon> well, you either always wait for results, or you don't and pay the price for threading. (or find another way to check whether a write succeeded)
[13:04:52] <algernon> ie, if you don't need the error message, and only want to check that the write arrived, you could query for it later
[13:06:03] <algernon> eg, insert({_id: "foo", ...}) and later find({_id: "foo"}, {_id: 1}) - if the find returns something, the insert hit the db, and the only requirement is that the find happens later than the insert
[13:06:04] <PDani> how can i query a specific write?
[13:06:34] <PDani> hm
[13:06:37] <algernon> you query the id, and see if it exists. not perfect, and it has its downsides too, but perhaps an option.
[13:06:40] <PDani> that's not bad :)
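A shell sketch of algernon's verification idea, with a made-up _id and collection name: the client picks its own _id at insert time, and a later find on that _id proves the write reached the server.

    db.events.insert({ _id: "foo", payload: 1 })
    // ... any time later, even on another connection:
    db.events.find({ _id: "foo" }, { _id: 1 })   // non-empty cursor => insert arrived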
[13:10:13] <PDani> another thing came to my mind: what if i have a connection pool in the client? Every connection has a state (free, waiting_for_getlasterror), and when i have to write, I choose a free connection, send a request and a getlasterror command, and put the connection in the waiting_for_getlasterror state; when i'm out of free connections, i try to read some responses from the connections
[13:10:40] <PDani> and it can be accomplished in one process
[13:10:53] <algernon> that's what I meant with the thread pool
[13:11:16] <algernon> but connection pool works just aswell, yes
[13:11:18] <PDani> the "thread" word misled me, because i'd implement this with select()
[13:11:31] <PDani> but thanks, it's a good idea
[13:37:59] <remonvv> you need both
[13:38:48] <remonvv> a connection pool isn't an alternative to a thread pool. If you have the option to have a thread pool (or multiple threads in general), you should use it for things like this.
[13:48:33] <remonvv> PDani, which language are we talking about here? You shouldn't need many context switches at all for a MongoDB driver.
[13:49:09] <PDani> remonvv, c, and mongo-c-driver
[13:49:30] <remonvv> Why do you think your context switching is a bottleneck?
[13:49:46] <PDani> i have to keep cpu usage as low as possible
[13:50:08] <remonvv> Of course, but there should typically be very few live threads at any point in time in a MongoDB driver.
[13:50:51] <PDani> i have to handle ~10000 write requests / sec
[13:51:21] <PDani> or more
[13:51:30] <remonvv> MongoDB or your driver? It should be able to do an order of magnitude more than that if you're just looking at driver CPU.
[13:51:39] <PDani> the client
[13:52:04] <remonvv> I'm confused. Are you writing a driver or using the c driver?
[13:52:33] <PDani> i'm using the mongo-c-driver right now
[13:53:01] <PDani> but of course i will have to implement the wire protocol myself if i'd like to make getlasterror asynchronous
[13:55:49] <remonvv> asynchronous on what level? It has to be invoked on the same connection, and since you're not allowed to use that connection in between, the driver parks the connection (i assume it does) until the selector for that channel has reads ready. That shouldn't result in any cpu load.
[13:57:59] <remonvv> On higher levels than that an asynchronous getLastError doesn't make much sense. The whole point of the GLE call is to prohibit asynchronous writes for w > 0 writes. If you want w <= 0 writes just skip the GLE.
[14:00:24] <PDani> remonvv, yeah, finally i decided to use threading. i thought i could implement something like writing-writing-writing, and sometimes reading acks for my writes, but mongodb is obviously not designed for that. so i will use separate threads with separate connections to speed things up at the price of some context switches
[14:07:36] <remonvv> well you can do that on a threading level but you need to park the busy connections
[14:07:40] <remonvv> the two things aren't that related
[14:08:08] <remonvv> a parked connection takes 0 cpu and a blocking thread takes very, very little
[14:08:31] <remonvv> your max throughput should not be bottlenecked by CPU, not even close.
[14:08:48] <remonvv> The only CPU intensive thing happening driver side is BSON serialization and some housekeeping.
[14:08:50] <remonvv> afk
[14:20:43] <PDani> remonvv, yes, you're right... i just had to think it through again :)
[14:23:20] <venom00ut> hi, when will mongodb support spidermonkey 1.8.5?
[14:26:09] <mids> venom00ut: probably never; I expect mongodb to move to v8 instead
[14:26:25] <mids> see comments on https://jira.mongodb.org/browse/SERVER-2887
[14:26:39] <venom00ut> mids, that's OK, but then, is v8 support stable?
[14:26:52] <venom00ut> gentoo offers the possibility to build with v8 support, but I'm not sure how stable it is
[14:27:13] <NodeX> stable enough for google
[14:27:18] <NodeX> stable enough for me !!
[14:27:33] <venom00ut> NodeX, even in 2.0.6?
[14:27:57] <NodeX> 2.0.6 of what ?
[14:28:08] <venom00ut> mongodb
[14:28:21] <NodeX> I don't know, I'm on 2.0.5, I haven't had the chance to update
[14:29:09] <venom00ut> I'll give v8 a try, I just hope it works, I'm not on a production system
[15:29:48] <remonvv> At some point someone will have to explain to me why SM vs V8 is a hugely relevant issue for production systems ;)
[15:31:10] <Derick> SM is not reentrant, so it can't run more than one script at the same time...
[15:37:05] <venom00ut> remonvv, just because I've been told that v8 support is experimental in mongodb
[15:43:48] <remonvv> Derick, still not that relevant for production systems though. V8 being re-entrant "fixes" JS concurrency somewhat, but that's about it. Performance is still vastly inferior to native functionality, and most functionality that currently requires JS and needs to scale is replaced by the new aggregation framework. MongoDB should probably drop JS altogether.
[15:44:24] <Derick> not disagreeing there. I've always advocated staying away from M/R or JS as much as you can
[15:50:28] <remonvv> Yeah exactly, it just doesn't scale very well.
[18:24:29] <e-dard> Hi. If I have a list of dictionaries in Mongo, and I'm searching for all documents where one of the dictionaries in the list has a certain value for one of its keys, how do I then unset the matching dictionaries so that they are removed from their parent lists without affecting the other dictionaries in the list?
[18:24:50] <e-dard> Hmmm ^ let me know if the last message was over 512 and got cut off..
[18:40:43] <e-dard> Anyone?
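e-dard's question maps to the $pull operator, which removes only the array elements matching a condition and leaves the rest of each array intact; the collection and field names below are assumptions:

    db.docs.update({ "items.status": "bad" },
                   { $pull: { items: { status: "bad" } } },
                   false, true)   // multi = true: apply to every matching document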
[19:34:29] <hadees> so i'm trying to figure out what i'm doing wrong in this query db.request_end_events.find({"t": {$gte: new Date(2012, 6, 18), $lt: new Date(2012, 6, 19)} }) if say the document is { "t" : ISODate("2012-06-18T22:05:07Z") }
[19:36:39] <algernon> hadees: you're missing $and
[19:37:56] <hadees> algernon: i was looking at http://cookbook.mongodb.org/patterns/date_range/ and i didn't see an $and
[19:41:47] <algernon> d'oh
[19:41:55] <algernon> hadees: Date()'s month starts from 0
[19:42:01] <algernon> try with 5
[19:42:31] <hadees> algernon: that worked, thanks, seems kind of odd but whatever
[19:42:41] <algernon> js is stupid at times
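The fix, spelled out: JavaScript Date months are 0-based, so June is month 5 and the query hadees wanted is

    db.request_end_events.find({ t: { $gte: new Date(2012, 5, 18),
                                      $lt:  new Date(2012, 5, 19) } })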
[19:42:54] <tystr> hmm
[19:42:56] <tystr> uncaught exception: map reduce failed:{
[19:42:56] <tystr> "assertion" : "assertion db/commands/mr.cpp:400",
[19:42:56] <tystr> "errmsg" : "db assertion failure",
[19:42:56] <tystr> "ok" : 0
[19:42:56] <tystr> }
[19:43:27] <tystr> is there any way to get a more detailed error message?
[19:48:09] <tystr> hmm
[19:49:10] <tystr> seems to error when I try to "group" with emit() by a field that doesn't exist in every document
[19:49:12] <tystr> hmm
[19:49:49] <tystr> i.e., i have emit(this.ts, …) but only some documents have that field.
[20:10:18] <tystr> is this the expected behavior?
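A common guard for tystr's situation, sketched: only emit when the field is present, since emitting an undefined key from documents that lack it can trip exactly this kind of mr.cpp assertion.

    var map = function () {
        if (this.ts !== undefined) {
            emit(this.ts, 1);   // skip documents without a ts field
        }
    };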