[01:39:19] <veesahni> Using ruby/mongoid, anybody know how to get the BSON-serialized representation of an object? I'm trying to stash an object to disk if I have a mongo connectivity failure (to unserialize and save! in the future)
[01:54:29] <fbjork> is it possible to search arrays of arrays in mongodb?
[01:55:54] <ralphholzmann> If I want to upgrade my database from 1.8, can I simply mongodump, install the new version, and restore?
[01:56:00] <ralphholzmann> will mongodumps for 1.8 work in 2.2?
[03:18:15] <michaeltwofish> Pleasure. I'm happy and shocked that I'm able to answer a question here :)
[03:20:14] <michaeltwofish> I have a question that isn't mongo specific ... I'm trying to raise ulimit -n on a CentOS box, as mongo is refusing connections, but it's not sticking.
[03:20:34] <michaeltwofish> Is anyone willing and able to help me troubleshoot that?
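(ed. note: a common reason ulimit -n doesn't stick on CentOS is that /etc/security/limits.conf only applies to PAM login sessions, not to daemons launched from init scripts. A hedged sketch, assuming mongod runs as the mongod user:)

    # persist the limit for login sessions
    cat >> /etc/security/limits.conf <<'EOF'
    mongod soft nofile 64000
    mongod hard nofile 64000
    EOF
    # for the init-started daemon, raise it in the init script itself,
    # just before mongod is launched
    ulimit -n 64000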
[03:22:13] <svm_invictvs> This may seem like a dumb question, but what functions are available when I'm running a mapreduce job?
[03:22:21] <svm_invictvs> Am I able to fetch objects from the collection?
[03:53:23] <cmatta> Does anyone know if there's a way to just get the sec value of a PHP MongoDate from inside an aggregate function? So far all I've been able to get it to output is MongoDate objects
[03:53:42] <cmatta> this is using the PHP mongo module
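(ed. note: one hedged shell workaround, assuming a server whose $subtract accepts two dates and returns their difference in milliseconds: subtract the epoch, then divide to get seconds. Collection and field names here are illustrative:)

    db.events.aggregate([
      { $project: {
          sec: { $divide: [ { $subtract: [ "$created", new Date(0) ] }, 1000 ] }
      } }
    ])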
[07:13:29] <saby> wereHamster this query executes fine when run in the shell, but when I create the same query using the java driver, the db.eval part gets enclosed in double quotes, making it a string instead of executing it
[08:43:34] <pgcd> failing that, I have a (possibly quick) question: if I have an embedded document, is there any way of retrieving the container document without a find()?
[09:05:01] <saby> I have a doubt regarding a query in the aggregation framework
[09:05:32] <NodeX> pgcd: I imagine it's something you would have to re-create as in a new model for this specific type of query
[09:05:35] <saby> i'm making an aggregation framework query which takes values from each row, performs calculations on them, and projects that value for each row
[09:05:49] <NodeX> I detest MVC coding in any language so I could be totally off the mark
[09:06:02] <saby> is it possible to have a check condition, so that if the value of a row is this, it multiplies the value by some x
[09:06:14] <NodeX> some python users are normally around before long, perhaps hang around till then?
[09:07:00] <pgcd> NodeX: thanks, I will hang around - i'm sure there's lots to be learned =) (and I'll have to really understand how/when to use map/reduce)
[09:17:01] <saby> the first $multiply is multiplying the result of all the calculations being performed by the return of the if condition
[09:17:20] <saby> so if $priority=6, multiply the score by 6, else multiply the score by 1
[09:17:27] <saby> kind of like boosting my results
[09:18:18] <ZacS1234> anyone know what you need to do to get the mongodb node.js driver/mongoose to auto reconnect?
[09:18:41] <NodeX> saby, not sure it can be done then
[09:18:51] <NodeX> I didn't read your question properly sorry
[09:20:20] <saby> NodeX so basically, in short, I have a variable called $score; if $priority=6 then multiply $score by 6, else multiply $score by 1
[09:20:34] <saby> hmmm, there might be someway to do it
[09:21:04] <NodeX> not with aggregation framework unless you multiply by a preset number
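(ed. note: for what it's worth, the 2.2 aggregation framework does have a $cond operator that can express this kind of conditional multiplier. A minimal shell sketch, assuming fields named score and priority:)

    db.items.aggregate([
      { $project: {
          boosted: { $multiply: [
            "$score",
            { $cond: [ { $eq: [ "$priority", 6 ] }, 6, 1 ] }  // x6 if priority is 6, else x1
          ] }
      } }
    ])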
[09:33:40] <ron> NodeX: how big is your dataset? I know it's a complex answer, but I'd appreciate a general idea. I'm trying to get a sense of the aggregation performance mongodb gives, since its map/reduce is less than optimal.
[09:34:38] <NodeX> I aggregate around 5M docs a day
[09:35:12] <NodeX> I do roughly 150k at a time to keep it fast
[09:35:20] <ron> what's their average size? do you do it in batch or online?
[09:35:41] <NodeX> online and they vary in size, let me paste a sample doc
[09:37:40] <ron> my boss wants to compare that solution and amazon's m/r but we're short on time and I think going with mongo for the initial stage should be more than enough.
[09:38:06] <NodeX> single machine 16gb ram, quad core xeon, live sites on the box too
[09:38:16] <ron> okay, that looks fair. I imagine our docs are of a similar size.
[09:38:22] <NodeX> fast = about 3 seconds for 150k rows mebbe
[09:40:22] <NodeX> basically in that 3 or so seconds I get everything I need to know about my app - i/e every single thing that's been clicked or visited and also counts (totals) for them
[09:40:33] <NodeX> it's very different but a similar concept
[09:42:20] <NodeX> well in terms of performance I don't worry because I aggregate every ten mins or so and only aggregate today's data
[09:42:59] <NodeX> when I finish a day's aggregation I export data older than 3 days ($gt : 3 days) to json and store it in Amazon Glacier
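(ed. note: a hedged sketch of that kind of export with mongoexport; the -q query selects which documents get dumped as JSON. Database, collection, and field names are illustrative:)

    # export documents older than a cutoff to a JSON file
    mongoexport -d mydb -c events -q '{ day: { $lt: 20121013 } }' -o events-archive.json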
[09:43:12] <jordiOS> hello! I have a document where I store the content per language and key, as $model->content[$language]['title']. I am looking to use ensureIndex on title for each language but I can't seem to find how. Any ideas or hints?? Thanks!
[09:43:20] <NodeX> so my collections stay smallish and that keeps performance nice and manageable
[09:43:48] <NodeX> jordiOS : you'll have to loop it
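(ed. note: a hedged sketch of that loop in the shell, assuming the set of languages is known up front; the index key has to be built as a string since it varies per language:)

    ["en", "de", "es"].forEach(function (lang) {
      var spec = {};
      spec["content." + lang + ".title"] = 1;  // e.g. content.en.title
      db.pages.ensureIndex(spec);
    });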
[09:51:59] <NodeX> if I then need to get results over time I can aggregate (or loop) the daily counters
[09:52:06] <ron> and I imagine that using Hadoop it would be the same.. aggregate on going data and not by demand.
[09:52:59] <ron> you just need to make sure that the data resolution is fine enough. I imagine that yours is daily since you don't need 'by hour' aggregations.
[09:53:29] <ron> do you need to aggregate several times for different views? for example, per day, per user, per url?
[10:01:29] <NodeX> I traded off realtime for efficiency
[10:01:47] <saby> NodeX can we have collection join in mongo ?
[10:01:53] <NodeX> it's near enough realtime for my app, other apps I would do it once a minute
[10:01:58] <ron> of course. I'm not even looking for a long run solution. if I go with your method, I think it could hold us for at least a few months efficiently.
[10:03:44] <saby> NodeX in the previous scenario, where I was trying to use if conditions: instead of that, I would insert another field like priorityMultiplier whose value points to an id number, and the value for that id number would live in another collection
[10:03:46] <martinrue> kali: does this mean the mongodb-native driver in JS is likely adjusting my local time dates to UTC before storing them?
[10:05:26] <saby> so for that computation, I cannot fetch the value from another collection ?
[10:06:17] <saby> i would be storing the id number of the other collections document in $priorityMultiplier
[10:09:11] <NodeX> it's not possible to join against or query another collection or document from within a single query
[11:16:25] <fotoflo> hello all. I have a collection playlists with a property videos. some videos have a property comments -- how do i select those comments?
[11:46:11] <remonvv> Why on earth can't you paste an example document? The odds of someone running away with a schema that currently doesn't even work are slim at best.
[11:54:53] <Bartzy> remonvv: Yes, after documents deletion
[11:55:19] <remonvv> If you mean what I think you mean, then no. MongoDB has not become more efficient at reusing disk space that is no longer used since 2.0+, as far as I'm aware.
[11:55:46] <remonvv> There are many other valid reasons to upgrade though.
[12:00:14] <NodeX> the ones I like the most are "Mysql can do it so why can't mongo" or "Apache can do it so why not nginx" ... my answer is "Go use Mysql / Apache then" LOL
[12:15:56] <remonvv> NodeX, yes. I had a reply similar to that on SO, it's quite popular ;)
[12:58:40] <omie__> suppose I have some data with keys/columns name,surname,city. If I try a query like: {'$or': [ {'name': {'$ne': 'xyz'}}, {'city': {'$ne': 'newyork'}} ]}
[12:59:19] <omie__> it doesn't work right. I mean, what I expect it to do is exclude all the records where either name is xyz or city is newyork
[12:59:46] <omie__> however, it only excludes records where name is xyz AND city is newyork
[13:01:54] <ppetermann> is that your complete find?
[13:03:59] <omie__> i mean if I do {'name': {'$ne': 'xyz'}, 'city': {'$ne': 'newyork'}}
[13:04:38] <omie__> it works as if it were OR; it takes out all the records where either of the conditions is true
[13:05:50] <omie__> so far I know only this. you want me to investigate more ? how should I do that ? (I am new to mongodb, also non relational databases)
[13:07:17] <omie__> and the behaviour is the same in both 2.0.7 and 2.2
[13:46:36] <omie__> actually I am working with django-nonrel. Initially I thought the problem was with django's filter() and exclude(), but later I checked the queries generated by django and tried them in the mongo shell
[13:46:51] <NodeX> no, it's doing this... (SQL) SELECT * FROM foo WHERE name !='john' OR city !='newyork'
[13:47:30] <NodeX> from your documents which are you trying to select and which do you want left out...
[13:50:11] <NodeX> The $or operator lets you use a boolean or expression to do queries. You give $or a list of expressions, any of which can satisfy the query.
[13:50:23] <NodeX> (from the docs) which means that anything can satisfy it
[14:14:50] <omie__> see, the data I am showing you now I made up for simplicity. I am making a simple log-viewer, and there are really very simple filters I wish to apply
[14:15:33] <NodeX> that's great but you still have not told me what data you want back from the queries and why what they currently return is wrong
[14:16:22] <NodeX> "The $or operator lets you use a boolean or expression to do queries. You give $or a list of expressions, any of which can satisfy the query."
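(ed. note: what omie__ wants is NOT(name = xyz OR city = newyork), which by De Morgan is name != xyz AND city != newyork. In the shell that is either the implicit AND of two $ne clauses, or equivalently $nor:)

    // implicit AND of two $ne conditions
    db.people.find({ name: { $ne: "xyz" }, city: { $ne: "newyork" } })
    // same thing with $nor: reject a doc if any listed clause matches it
    db.people.find({ $nor: [ { name: "xyz" }, { city: "newyork" } ] })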
[14:51:24] <ThomasJ__> If I have a shard that's extremely low on disk space, what can I do to make sure it doesn't completely run out of space?
[14:55:16] <ThomasJ__> It's only this one shard that's full. Most of my other shards have lots of space
[14:58:40] <tncardoso> ThomasJ__: probably your shard key is not dividing content in a balanced way
[14:59:03] <bgilbert> Hey guys, I'm seeing a strange issue using the java driver......I keep running into the following exception: java.lang.IllegalArgumentException: response too long: 1634610484, when the driver tries to create a response from mongo
[15:00:01] <bgilbert> this seems to be happening at random, and happens in cases where mongo shouldn't be returning any results from the query......and the size reported is inconsistent and normally way larger than it should or could be
[15:00:45] <bgilbert> I'm guessing some form of corruption with the response.....has anyone seen an issue like this or have any suggestions on where I should look next to further debug this issue?
[15:00:52] <ThomasJ__> I tried draining that overloaded shard but I get "Can't have more than one draining shard at a time" (I am draining another one as well). Is this a hard limit on mongo?
[15:49:04] <Industrial> At what scope should I be opening connections to MongoDB though node-mongodb-native? application scope (1 connection per app) or request scope?
[15:58:23] <Mmike> I'm upgrading mongo in my cluster - i upgraded one secondary and I see 'syncing to' in its web interface
[15:58:34] <Mmike> is there a way I can see how long that will last?
[15:59:00] <Industrial> what I'm really trying to avoid is wrapping my whole application in a db.open(cb). Is there a way to do this once? (I'm cleaning up with db.close on process.exit/SIGINT)
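(ed. note: a minimal sketch of the application-scope pattern with the node-mongodb-native API of that era: open once at startup, stash the handle, and reuse it for every request. Host, port, and database names are illustrative:)

    var mongodb = require('mongodb');
    var server = new mongodb.Server('localhost', 27017, { auto_reconnect: true });
    var db = new mongodb.Db('mydb', server, { safe: true });

    db.open(function (err, openDb) {
      if (err) throw err;
      // share this single handle across the app instead of
      // wrapping everything in db.open(cb)
      module.exports.db = openDb;
    });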
[16:04:57] <Mmike> Also, on the server I just upgraded I see it's syncing to another secondary. But on that (other) secondary, and on the primary, I don't see a 'syncing to' status on any box
[16:26:29] <Bartzy> Is it possible (through the drivers or otherwise) to try to insert a document, but if it exists (i.e. a unique index with that value already exists in the collection), just return the _id ?
[16:28:01] <algernon> Bartzy: but you can try an insert with safe mode, and if that returns an error, do a query on the unique index and return whatever you want.
[16:28:38] <rybnik> Hi there fellows, the mongoosejs channel is rather idle, so perhaps someone here can give me some advice on a "mongoose schema design", I've created a gist describing the problem, anyone care to give it a try ? https://gist.github.com/3900332
[16:28:46] <Bartzy> algernon: Why not try to query that unique index, and if nothing returns - insert?
[16:29:27] <algernon> Bartzy: because between the query and the insert, something may insert it
[16:29:57] <Bartzy> That is extremely unlikely, but correct :)
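(ed. note: a hedged alternative that avoids the race in a single round trip: findAndModify with upsert returns the matched-or-created document, _id included. The collection and unique key are illustrative:)

    var doc = db.users.findAndModify({
      query:  { email: "a@example.com" },            // the unique key
      update: { $set: { email: "a@example.com" } },
      upsert: true,
      new:    true                                   // return the doc after the upsert
    });
    // doc._id is the existing or freshly created id either way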
[17:01:13] <twrivera> What is the best practice for storing dates in mongodb? epoch values?
[17:04:01] <kali> mongo and bson have a date type, which is a millisecond offset from the epoch
[17:04:47] <kali> twrivera: drivers map this natively to the language time class
[17:04:53] <kali> so it's probably your best option
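(ed. note: in the shell that just means inserting a native Date; it is stored as a BSON date and each language's driver hands it back as its native time class. A minimal sketch:)

    db.events.insert({ name: "signup", createdAt: new Date() })
    // comes back as a real date (an ISODate in the shell), not a string
    db.events.findOne({ name: "signup" }).createdAt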
[17:10:52] <thewanderer1> hi. let's say I have a movie database, each movie has an "actors" array. now, I want to obtain a list of actors and movies each has played in. how would I do that?
[17:11:25] <thewanderer1> it's easy to do if I want the movies in which "Tom Hanks" has played, but what if I need this for all actors? looping through them all and issuing queries like mad doesn't sound wise.
[17:12:19] <kali> thewanderer1: you can look at the aggregation framework if it is an occasional query
[17:13:08] <thewanderer1> aha, so essentially map-reduce?
[17:13:56] <kali> thewanderer1: mmm... well, the implementation is actually very different. m/r relies on a JS interpreter, whereas the AF is a native implementation
[17:14:20] <thewanderer1> the most straightforward way is to create <actor,movie> pairs and then reduce that to actor documents, I imagine... would the AF do that?
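(ed. note: that is essentially what the aggregation framework does here: $unwind produces the <actor,movie> pairs and $group folds them back up per actor. A minimal sketch, assuming movie docs with title and actors fields:)

    db.movies.aggregate([
      { $unwind: "$actors" },                        // one doc per <actor, movie> pair
      { $group: { _id: "$actors",                    // regroup the pairs per actor
                  movies: { $push: "$title" } } }
    ])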
[17:14:39] <kali> thewanderer1: that said, if the DB is too big, the AF may not be enough
[17:14:44] <thewanderer1> (and it doesn't really sound very efficient, maybe I'm not using the right tool for the job... SQL comes to mind as "righter")
[17:15:14] <kali> nope, it's not efficient, you're basically pulling the entire db
[17:15:35] <kali> but SQL would not be much more efficient
[17:15:44] <twrivera> kali: i'm getting the date values in json format for ex: /Date(1250121600000)/ from an API and I want to insert that value into mongo docs
[17:15:46] <kali> the "ugly" join is just more visible in mongo
[17:16:25] <kali> twrivera: what language are you pulling the data from ?
[17:16:29] <thewanderer1> hopefully SQL would have a way to optimize the join
[17:18:03] <kali> thewanderer1: not really, you're really hitting an algorithmic problem. in both cases, you're down to a sort of one of the collections, so you're O(n*log(n))
[17:18:15] <kali> thewanderer1: sql does not do miracles
[17:18:31] <kali> thewanderer1: it just hides the actual ugliness of the world from you :)
[17:19:28] <thewanderer1> well, yes, the datasets (actors, movies) would need to be sorted in both SQL and Mongo
[17:20:11] <kali> twrivera: i can't help with python, all i remember about manipulating dates in python is a lot of pain
[17:21:22] <thewanderer1> actually, it's not that bad... I only wonder about the space needed (disk, RAM). would Mongo or SQL have any advantage in this regard?
[17:22:00] <Mmike> where do I change the number of connections for mongodb?
[17:22:08] <Mmike> i have maxconn in my conf file, but I still get this:
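(ed. note: on 2.x the config option is spelled maxConns in the ini-style config file, and the effective ceiling is also bounded by the process's open-file ulimit, so both may need raising. The value is illustrative:)

    # /etc/mongodb.conf
    maxConns = 2048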
[17:22:11] <twrivera> kali: I'm learning that the hard way. but how would you store the dates in the db? as a string or a timestamp object? I figured out how to convert the epoch to a string and I can convert from there
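(ed. note: a hedged shell sketch for that payload: pull the millisecond value out of the "/Date(...)/" wrapper and insert it as a real BSON date rather than a string, so range queries and sorts keep working. Collection and field names are illustrative:)

    var raw = "/Date(1250121600000)/";
    var millis = parseInt(raw.match(/\d+/)[0], 10);
    db.events.insert({ fetchedAt: new Date(millis) })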
[17:23:25] <kali> thewanderer1: to be honest, the thing that strikes me is... mongo and *sql are more designed for "small" queries that impact only a small part of your dataset
[17:26:55] <thewanderer1> kali, I'm asking all this because I'm writing a task assignment system (human resources, all that stuff), and need to show who's doing what, etc... and aggregation plays a key role here
[17:27:31] <thewanderer1> and Mongo seems really great because of document storage (rich information), only aggregation so far is troublesome
[17:28:01] <kali> thewanderer1: sure. you need to have a look at the aggregation framework. it's much better than map reduce, but will explode if the dataset is too big
[17:28:53] <thewanderer1> kali, define too big? O.o
[17:29:43] <kali> thewanderer1: "If any single aggregation operation consumes more than 10 percent of system RAM the operation will produce an error."
[17:35:05] <thewanderer1> so it does seem that AF is the go-to technology in this case... now I only need to get Mongo >=2.1 (Debian stable is still at 1.4) and I'm all set. thanks!
[17:36:03] <kali> thewanderer1: yes. AF is designed to avoid 99% of m/r use cases, and for good reasons
[17:46:55] <Mmike> How do I know when replica is in sync with its primary?
[17:47:09] <Mmike> I always seem to see 'syncing to' on the _replSet status page
[17:49:32] <ralphholzmann> why is a three member replica set recommended?
[17:52:57] <kali> ralphholzmann: the problem is that when one of the servers can't talk to the others, it has no way to tell the difference between a network split and a host failure
[17:53:15] <kali> ralphholzmann: so in order to avoid the dreadful split-brain, it has to go down too
[17:53:16] <Mmike> kali, where do I get those? I'm checking the _replSet web page on each server, but they don't seem to give me the same data. On srvA I see that srvB is syncing to srvC, but I don't see that on srvB or srvC
[17:53:42] <Mmike> _m, i get something like this: source: ded778:27017\n syncedTo: Tue Oct 16 2012 13:44:23 GMT-0400 (EDT)\n = 340 secs ago (0.09hrs)
[17:53:43] <kali> Mmike: in the array at the top ?
[17:53:45] <Mmike> does that mean it is still syncing, or that it is synced?
[17:57:01] <kali> ralphholzmann: i'm sure you need an arbiter or a third server. as for datacenter, it comes with its bundle of costs and issues. i won't answer that question for you
[17:58:04] <kali> Mmike: what state is your node in? RECOVERING? or SECONDARY?
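(ed. note: a quick hedged way to answer both questions from the shell: rs.status() lists every member's state and last applied op time. A member is caught up when it is SECONDARY and its optimeDate tracks the primary's:)

    rs.status().members.forEach(function (m) {
      print(m.name + "  " + m.stateStr + "  " + m.optimeDate);
    });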
[17:58:10] <ralphholzmann> i appreciate your help kali
[18:38:09] <Mmike> thnx for the info, learned a lot today :)
[18:38:21] <jmar777> any chance that someone has some input on http://stackoverflow.com/questions/12917943/casbah-java-mongodb-driver-java-lang-illegalargumentexception-response-too-l ?
[19:13:41] <_m> uroborus_labs: Have you perused the docs on storing documents in mongo? You'll probably want a field to represent a 'user id' and fields for whatever appointment data you care to store.
[19:14:32] <uroborus_labs> I think my main question would be, how much data to actually store in the document itself and how much to infer from the application itself
[19:15:04] <uroborus_labs> For example, if a doctor has availability from 9am to 5pm
[19:15:29] <uroborus_labs> Would you actually store that there is an empty slot?
[19:16:44] <uroborus_labs> Or would you store availability and what appointments have been made and then show what slots are open from that data on the application side
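(ed. note: a hedged sketch of that second approach: persist only the availability window and the booked appointments, and derive the open slots in the application at read time. All field names are illustrative:)

    {
      doctorId: ObjectId("507f191e810c19729de860ea"),
      date: ISODate("2012-10-16T00:00:00Z"),
      availability: { start: "09:00", end: "17:00" },
      appointments: [
        { patient: "jdoe", start: "10:00", end: "10:30" }
      ]
      // open slots = availability minus appointments, computed on read
    }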
[19:47:30] <pgcd> if anybody's using django-mongodb: is there any (API-based) way of finding an object in a list of embedded objects? I mean, if I have the usual Post with comments, can I find a specific comment by ID?
[20:24:50] <ashley_w> i have some strangeness in my program since i added some code yesterday: http://pastie.org/5069647
[20:39:57] <jmar777> i know there used to be some issues with using indexes on a count() operation. is that still the case?
[21:47:55] <Bilge> Soooooooooooooooooo... I accidentally just rm -rf'd my mongodb dir
[21:48:02] <Bilge> But it seems that it still has everything cached
[21:48:11] <Bilge> Is there some way I can flush it back to disk before it truly is all gone?
[21:49:19] <mgriffin> Bilge: do not stop the service!
[21:53:00] <rybnik> Bilge: just a quick question, I'm rather interested in this…. if you perform a db.coll.find() will this have any negative impact on the cache ?
[21:57:45] <mgriffin> probably I would also want to do something like mongodump, or something that does a logical export of the data (I don't know how to back up mongo well, someone else comment)
[22:01:08] <mgriffin> yeah, it seems that mongodump is a really good idea probably, since it seems to: connect over socket, request data from running instance
[22:01:38] <Bilge> Well I managed to copy my database back from the file descriptors like in the article
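(ed. note: the /proc trick being referred to, sketched for Linux: files deleted while a process still holds them open stay reachable through that process's file descriptors. The fd number and target path are illustrative:)

    pid=$(pidof mongod)
    ls -l /proc/$pid/fd | grep deleted     # find fds pointing at deleted db files
    cp /proc/$pid/fd/12 /data/db/mydb.0    # copy the data back out while mongod still runs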
[22:03:03] <Bilge> As it happens this is just dev data anyway so it's all expendable, but it would still be a pain to lose it. I'm really impressed by your knowledge mgriffin :)
[22:11:43] <Bilge> No, it's just that Unix variants without procfs really are inferior
[22:14:08] <mgriffin> Bilge: they probably have superior filesystems with undelete ;)
[23:06:22] <cjhanks> So the C++ doxygen is entirely out of date and incomplete, is there any better source?
[23:19:32] <cjhanks> And a = Date_t( ); a == Date_t(a.toTimeT()) --> False. When querying. Ech...
[23:44:27] <kenyabob> new guy here. I've used php to create a list of json objects I want to import into mongodb; not sure of the simplest way to just feed in this comma-delimited list of objects.
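(ed. note: mongoimport handles exactly this: wrap the comma-separated objects in [ ] so the file is one JSON array and pass --jsonArray. Database and collection names are illustrative:)

    mongoimport -d mydb -c things --jsonArray --file objects.json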