pmxbot IRC Log Viewer

[00:31:22] <Moussekateer> Hi guys

[00:32:21] <Moussekateer> I'm having an issue using pymongo. I have a field with an integer zero value but pymongo returns zero results when I do find({'field': 0}). How do I get round this

[01:36:12] <carlzulauf> Given a collection containing values like 'changes: {"rent": [125, 150]}' ... how can I find documents where rent[0] is 125, but rent[1] is not ?

[01:38:44] <carlzulauf> not sure how to match the first element in an array

[01:40:57] <carlzulauf> is using $where the only way?

[02:03:31] <TimRiker> using the java mongo driver, trying to connect with a mongouri. I keep getting a number parse error. does this work for anyone?

[02:12:58] <TimRiker> ah. found it.

[04:32:39] <quintus> hey, would anyone know how query the keywords field for the value "word2"? Example: {article: name="article", keywords=["word1", "word2", "word3"]}

[04:32:51] <quintus> i haven't been able to get $in to work

[07:10:31] <Hirzu> Hi, does someone know how a total newbie could find docs to help do following: I have a document in Mongo and this document contains collection of key value pairs. I have a set of key value pairs that should be sent to the database and then those key value pairs should be matched with the ones stored in the database + averages counted for those values.

[07:14:15] <CAMDesigns> Hey All,

[07:14:36] <CAMDesigns> i'm using the PHP mongo driver and inserting into the DB

[07:14:48] <CAMDesigns> but my numbers all defaultto strings

[07:15:01] <CAMDesigns> any ideas on how to force numeric

[08:03:13] <svm_invictvs> Is a DBCollection object particularly "Heavy" in mongodb?

[08:03:50] <svm_invictvs> IN other words, would it be bad practice to repeatedly create an instance of DBCollection frequently or for just one or two method calls?

[08:11:13] <Hirzu> Hi, does someone know how a total newbie could find docs to help do following: I have a document in Mongo and this document contains collection of key value pairs. I have a set of key value pairs that should be sent to the database and then those key value pairs should be matched with the ones stored in the database + averages counted for those values.

[08:11:30] <Hirzu> Or should I use RDBMS for this purpose?

[08:12:29] <ron> You should be able to do that with either map/reduce or probably better yet, the aggregation framework.

[08:12:49] <Hirzu> is there any documentation to give examples on how to do that?

[08:14:06] <ron> http://cookbook.mongodb.org/ and http://docs.mongodb.org/manual/applications/aggregation/

[08:14:17] <Hirzu> I've not been able to find anything that explains to me how to do that. I might be just too dumb or something.

[08:14:52] <ron> No, you just need to understand the concept rather than expecting the solution to be laid out.

[08:17:22] <Hirzu> the concept seems to be for blogs, when watching presentations

[08:18:01] <Hirzu> but Mongo actually seems to have a promise for something bit more. The learning curve is just so deep for me

[08:18:38] <daslicht> How to structure data in mongodb which is comparable to mysql foreign keys ?

[08:18:38] <daslicht> eg in MySQL I would have 2 table User and UserGroup

[08:18:38] <daslicht> How would i do this in MongoDb ? How to deal with future updates of separated MongoDB Documents?

[08:18:59] <NodeX> it will be 2 queries

[08:19:05] <NodeX> and you have to deal with it appside

[08:19:08] <ron> daslicht: you don't. keep it all in one place.

[08:19:34] <daslicht> if i keep it all in one place i would have to update all things manually in future ?

[08:19:42] <ron> I find that MongoDB + external indexing is a good combination.

[08:20:16] <ron> yes. that is the cost of using NoSQL. But it also depends on your usage.

[08:20:32] <daslicht> ok cool

[08:20:39] <daslicht> so i got it right :)

[08:20:50] <daslicht> how do you deal with external indexing?

[08:21:06] <ron> You should model your data based on the operations you're doing most frequently.

[08:21:30] <ron> you can have a mash of data collections if you need.

[08:21:46] <ron> duplicating some data.

[08:21:56] <ron> you just... index it. externally. :)

[08:22:32] <svm_invictvs> When setting the id of a DBObject in Mongo, is it $id or _id?

[08:22:34] <svm_invictvs> I can't recall

[08:22:56] <NodeX> ObjectId == _id

[08:24:21] <daslicht> ron, what do you mean by external indexing ? how do you solved it technically :)

[08:24:44] <ron> daslicht: using solutions such as Lucene, Solr, ElasticSearch, Sphinx.

[08:25:13] <daslicht> huh never heard of them

[08:25:15] <daslicht> :)

[08:25:17] <daslicht> I google

[08:25:32] <NodeX> they dont really have anything to do with trying to solve your user/group problem

[08:25:40] <ron> of course you should use them only if you need them. if you can solve everything you need with Mongo's capabilities, that's great. Mongo has some great features in it.

[08:25:43] <daslicht> its no problem

[08:25:51] <daslicht> it was just an example

[08:26:20] <daslicht> i just shoed this example to verify my knowledge

[08:26:37] <daslicht> so at the moment i would just save a usergroup in the user doc

[08:26:39] <daslicht> eg

[08:26:42] <ron> they _could_ be related to the user/group problem, depending on the complexity you want to achieve. for a simple use case, you won't need it.

[08:26:45] <daslicht> without a second document

[08:26:57] <ron> depends on what usergroup includes

[08:27:25] <NodeX> daslicht : the one thing to remember with Mongo / Most nosql solutions is that ... one size never fits all

[08:27:30] <NodeX> and it never will

[08:27:53] <ron> also, denormalization is quite common.

[08:27:59] <daslicht> :)

[08:27:59] <NodeX> this is the main reason that SQL solutions don't scale - because they try to fit every model

[08:28:04] <ron> which is the opposite of what you're used to with relational databases.

[08:28:43] <daslicht> do nosql tend to have redundant data stored and thererfore tend to take more space ?

[08:28:53] <ron> yes.

[08:29:23] <daslicht> what are the advantages of using nosql besides its easier handling ?

[08:29:27] <ron> one of the reasons nosql databases gained traction is that disk storage nowadays is significantly cheaper than it was 10 and 20 years ago.

[08:29:53] <ron> the ability to scale out, performance, lack of several relational restrictions.

[08:30:06] <NodeX> for me the advantages are speed, ease of use, scalability, faster development to name a few

[08:30:23] <ron> of course there are also disadvantages.

[08:30:26] <daslicht> great

[08:31:17] <daslicht> i have just gained some experience with nodejs + mongo and i love it

[08:31:17] <NodeX> I never ever used SQL with joins or in a relational way when I was using it because I could always see that as a bottleneck so Mongo was a natural fit for my way of programming

[08:31:18] <daslicht> much less acrobatics than using doctrine or mysql with php

[08:31:18] <daslicht> :)

[08:31:18] <NodeX> programming/thinking

[08:31:34] <daslicht> *feels home*

[08:32:42] <daslicht> GridFS is used to store binary data as document correct?

[08:33:03] <NodeX> correct

[08:33:21] <svm_invictvs> daslicht: yeah, from what I read (never used it) it's basically a filesystem built on top of documents in the database

[08:34:15] <daslicht> any experience with gridfs vs storing just the path of a file ?

[08:34:16] <svm_invictvs> daslicht: Whether or not it's the "right" filesystem to suit your needs that's a different story

[08:34:19] <NodeX> it has 2 parts, the chunks and the index of the chunks (in simplest terms)

[08:34:50] <NodeX> daslicht : I use both and gridfs is certainly more portable and infinitely scalable

[08:35:03] <daslicht> ok

[08:35:10] <NodeX> plus you can always add a cache in the way and there is no direct link to a file on the FS

[08:35:11] <[AD]Turbo> hola

[08:35:19] <daslicht> i come back to it once i have mastered more node things

[08:35:43] <svm_invictvs> There's a fuse driver for it too, heh

[08:35:43] <daslicht> Thank you very much !

[08:58:10] <Aartsie> Hi all!

[09:05:40] <Aartsie> i got this error: Cannot overwrite `Users` model once compiled. in my node.js script i use mongoose but i don't know how to fix it :( can somebody tell me how to fix this ?

[09:31:48] <sharad> hi

[09:32:22] <Aartsie> Hi

[10:04:04] <megawolt> hey

[10:04:43] <megawolt> i have questions about c# driver and connections

[10:05:22] <megawolt> is there anyone can help?

[10:08:07] <NodeX> better to just ask and wait than ask to ask

[10:08:30] <megawolt> thanx you're right

[10:09:25] <megawolt> i have a replica set

[10:10:03] <megawolt> i succesfully connect with simultaneous with c#

[10:10:43] <megawolt> but i'm getting "Unable to connect to a member of the replica set matching the read preference Primary" Error 1 of 3 times

[10:10:53] <megawolt> could anyone help me?

[10:13:28] <mandark> Hi there, I have a strange behavior here with '$exists' being inserted, is it a known error / bug / misunderstanding : http://pastebin.com/rZZ3wQaq ?

[10:14:23] <remonvv> db.version()

[10:14:25] <remonvv> ?

[10:14:41] <mandark> remonvv: 2.2.2

[10:15:00] <megawolt> db version 2.2.2

[10:15:41] <mandark> remonvv: And pymongo version 2.4

[10:15:49] <remonvv> It is due to the upsert.

[10:16:45] <mandark> remonvv: But, why ?

[10:16:58] <megawolt> mandark : the error seen in getting data not inserting state

[10:16:59] <mandark> remonvv: It seems to have the correct behavior with {'_id.url': {'$exists': False}, '_id.year': 2012}

[10:17:18] <remonvv> mandark: It's..well let's call it a bug.

[10:17:57] <remonvv> mandark: No, that's not correct behaviour. The reason it goes "wrong" is because the $exists condition in one case fails and thus it becomes an upsert rather than an update.

[10:18:13] <remonvv> If you try db.foo.insert(doc) with your original case you'll see the error this operation should give.

[10:18:17] <remonvv> That it doesn't is a bug.

[10:18:40] <remonvv> What happens is this : find(doc) == notfound -> "raw insert"(doc)

[10:18:55] <remonvv> oversimplified. That last step doesn't seem to validate field names.

[10:19:07] <mandark> remonvv: Ok so why does it seems to work with {'_id.url': {'$exists': False}, '_id.year': 2012} ? Cause they tempered the bug removing '$exists' at only the root level ?

[10:19:10] <remonvv> Feel free to report it on jira.mongodb.com

[10:19:57] <remonvv> mandark, because that query will always evaluate to finding a document if the collection is non=empty

[10:20:04] <remonvv> and as such wouldnt trigger the upsert code path

[10:20:07] <remonvv> trigger=follow

[10:21:20] <remonvv> That last query is basically saying "find all documents where _id:url is not set. I'm assuming at least one document matches that criteria in your database.

[10:21:25] <remonvv> (test with find(doc)

[10:22:00] <kali> remonvv: i would say the root cause is the query is invalid: the "matching" part should be either a valid value, either a "matcher" ($elemMatch in this case)

[10:22:31] <kali> remonvv: here we have mix that will never match anything

[10:23:03] <kali> it's harmless in a find, but obviously, the upsert is a loophole

[10:23:04] <remonvv> There is nothing wrong with the query is there? Any query object that find(c) accepts is valid for update(c, u, false, true)

[10:23:25] <mandark> remonvv: Do you mean both syntaxes have different meaning ?

[10:23:30] <kali> remonvv: i don't think you can use $exists this way

[10:23:52] <remonvv> What way? checking for the existence of an embedded field or doing it on an _id field?

[10:24:17] <remonvv> The latter is useless ofcourse in the case of an upsert but it should definitely throw an error.

[10:24:20] <kali> remonvv: i think the syntax assume the "value" part of the selector is either a "valid value" or a matcher (something starting with $)

[10:24:24] <mandark> remonvv: Don't know, for me they are almost equivalent

[10:25:13] <mandark> Tricky question, how does mongo can work with an upsert with '{foo: {$gt: 42}}' ? It inserts {foo: 43} to match the condition ? ^-^

[10:25:15] <kali> remonvv: if you write theat with an $elemMatch, it becomes correct { _id: { $elemMatch : { year: 2012, url : { $exists: false} } } }

[10:25:16] <Aartsie> Is it better to build a model for a find and a new model for a save or can i create one model for both ?

[10:27:02] <kali> mandark: it creates a doc without "foo"

[10:27:19] <kali> mandark: http://uu.zoy.org/v/gutobi#clef=brlkolcdqmuurbet

[10:27:34] <remonvv> kali: $elemMatch is an array operator. That's possibly even more broken if anything.

[10:27:57] <khushildep> Hi Chaps. Trying to do a mongorestore and I get " what(): locale::facet::_S_create_c_locale name not valid" - my locale is all set for en_GB.UTF-8 - anyone come across this?

[10:28:16] <mandark> kali: That's a good solution but that's false as it does not match the query ...

[10:28:26] <khushildep> Mongd is db version v2.2.2, pdfile version 4.5

[10:28:31] <kali> remonvv: indeed

[10:29:09] <remonvv> kali: I would prefer MongoDB erroring out on that one.

[10:29:21] <remonvv> kali: Or, if pressed, they document exactly what happens in such edge cases

[10:29:35] <kali> remonvv: ho, yeah, that should explode, i aggree

[10:29:51] <kali> remonvv: either for an invalid query, or for an invalid query to upsert with :)

[10:30:52] <remonvv> kali: The latter I think ;)

[10:32:10] <remonvv> Trying to think of valid cases where you'd want an _id based query for an upsert.

[10:32:47] <kali> remonvv: the _id is not the problem... it's the fact you're upserting without providing a fully defined document

[10:33:01] <remonvv> kali: I know, it was a seperate train of thought.

[10:33:08] <kali> ha ok :)

[10:33:26] <remonvv> kali: As in, can you think of useful scenarios where you'd do update({_id:.., other:..}, {some update}, false, true)?

[10:33:55] <remonvv> kali: I feel if you find yourself doing that you have a schema issue. But that might be my lack of imagination ;)

[10:35:17] <kali> remonvv: i have cases with upsert looking like { prop1 : value1, prop2: value2, ... propN: valueN }, { $inc: { counter: 1 }} with a unique index on prop1... propN

[10:35:25] <remonvv> kali: Perhaps if your application is in a position to set the _id to an appropriately unique value, but meh : update({_id: UUID}, {$set:{name:"Remon", level:"noob"}}, false, true)

[10:35:31] <kali> remonvv: it could be implemented with an hybrid id

[10:35:52] <remonvv> kali: True but it's better the way you do it compared to using the _id really

[10:36:27] <kali> remonvv: depends, it can be argued i'm wasting time and space maintaining two indexes instead of one

[10:36:28] <mandark> For interested people about this bug, i opened it here : https://jira.mongodb.org/browse/SERVER-8407

[10:36:31] <remonvv> kali: That said I dislike compound _id values to begin with. Gets messy.

[10:37:00] <remonvv> kali: Can it? Same memory usage give or take and the compound index is less flexible than two smaller indexes.

[10:37:34] <remonvv> kali: Well, inserts/updates are slower I suppose.

[10:37:37] <kali> remonvv: well, i need the compound index for my updates to be fast

[10:38:13] <kali> remonvv: this is counting hits on web services endpoints according to a collection of criteria, so write intensive use case

[10:38:23] <remonvv> kali: Yeah if all fields in the compound have high cardinality that's your only route.

[10:38:56] <remonvv> kali: Fair enough. All the more reason to know what happens when inside MongoDB so people can pick the appropriate schema and indexes. More should be written about that sort of thing.

[10:40:31] <kali> ho, i think there is enough said and written about his actually. but most people are using mongodb without reading or watching the material

[10:41:04] <kali> or take the time to learn about just one bit too late :)

[10:44:33] <remonvv> Hm, I look for new blogs and articles on a weekly basis and most are fairly basic. If you look at all the @10gen posted tweets about new blogs and presentations for example a lot of people are basically rehashing the same stuff.

[10:44:46] <remonvv> I'm sure there are exceptions.

[10:45:49] <kali> remonvv: i think most people haven't grasped the basics of btree indexing and what are its consequences in terms of prefixing, ranges queries, ordering and so on

[10:46:31] <kali> remonvv: every official mongodb event i have seen included a presentation on this very topy by a 10gen staffer, so... i think they're trying as much as they can

[10:48:15] <remonvv> kali: Fair enough. All the information I gathered on these topics a year or two ago came from webcasts of 10gen presentations.

[10:49:14] <remonvv> kali: Although that's only a relatively small part of the solution. Documentation and whitepapers are more useful references and that still seems to lack a bit. Online manual for example is rather messy, and that's me being charitable.

[10:50:14] <kali> yeah, i hate the online manual too

[10:51:59] <remonvv> kali: Yep, room for improvement ;)

[10:52:45] <Derick> I think its messy now because we're all rewriting and reformatting it.

[10:52:45] <remonvv> kali: How did we get to this?

[10:52:52] <remonvv> kali: Oh yes, upserts

[10:54:17] <remonvv> Derick: Undoubtedly but I don't think it's moving in the right direction, nor is the portal layout itself very practical.

[10:54:52] <Derick> I can't say that I can find anything easily... so I see your point.

[10:55:04] <Derick> remonvv: If you have any specific points, I'd be happy to pass them on though!

[10:55:22] <remonvv> Derick: And regardless, developers shouldn't be exposed to the process of documentation and manuals moving from one iteration of it to the next. Should happen internally and released when it's done.

[10:55:40] <remonvv> Derick: It's hard to get really specific but it's a generally frustrating experience to look for specific information.

[10:57:26] <remonvv> Derick: I think the main issue is the structure. The hierarchy seems to be focussed on dividing section in the most appropriate leaf in the documentation tree. That shouldn't be the "entry point".

[10:57:51] <Derick> remonvv: don't understand that...?

[10:58:12] <remonvv> Derick: That's not a reflection on you. I'm explaining badly ;)

[10:59:23] <Derick> :-)

[11:00:31] <remonvv> Derick: Basically, the documentation is structured in a way that seems to answer "How do we put all information in an as clear as possible hierarchy" rather than "With what questions would people arrive to this page"

[11:01:29] <remonvv> Derick: Basically, people come to such a site with a question like "How do I do X" and the current site doesn't really facilitate answering that question.

[11:03:41] <remonvv> Derick: Example : "DB admin wants to add a shard to a cluster" and "Developer wants to know what an appropriate shard key might be". Two completely unrelated questions and two complete different roles but they'd both click "Sharding" in the menu now.

[11:06:53] <daslicht> hi

[11:06:57] <remonvv> Hi!

[11:07:00] <daslicht> please have a look on this:

[11:07:01] <daslicht> http://pastie.org/5959044

[11:07:02] <daslicht> :)

[11:07:16] <daslicht> seams to be that i have set my ensure Index wrong

[11:07:29] <remonvv> Why?

[11:07:33] <daslicht> any suggestion how i get the email property unique

[11:07:34] <daslicht> ?

[11:07:40] <remonvv> It is

[11:07:51] <remonvv> "unique" : true

[11:07:53] <daslicht> hm but i can add multiple users with the same email

[11:08:01] <NodeX> dropDups

[11:08:18] <remonvv> Yeah, did you not start with an empty collection?

[11:08:24] <NodeX> and in your example you want "user.email"

[11:08:34] <daslicht> aha

[11:08:38] <remonvv> NodeX <3

[11:08:42] <daslicht> i try

[11:08:42] <NodeX> db.foo.ensureIndex({"user.email":1},{unique:true});

[11:08:49] <NodeX> remonvv \o

[11:08:53] <ron> remonvv: your queen retired!

[11:09:40] <daslicht> thanks i try

[11:09:44] <daslicht> brb

[11:17:55] <daslicht> hm i can still add duplicates

[11:18:25] <NodeX> pastebin db.foo.getIndexes();

[11:19:25] <Zelest> What is the minimum amount of servers required for sharding?

[11:20:18] <NodeX> 2

[11:20:25] <NodeX> well 1 really

[11:20:36] <NodeX> you can shard on 1 box with 2/3 instances

[11:21:16] <Derick> Zelest: production or play? :)

[11:21:19] <daslicht> http://pastie.org/5959561

[11:21:53] <daslicht> i tried settingth index twice since teh fisrt attempt did not work

[11:22:10] <Zelest> Derick, somewhere in between.. production of a hobby project. ;)

[11:22:33] <Derick> daslicht: you want to drop the blogs.user.email one then

[11:22:55] <Derick> daslicht: and the duplica check is only per *document*

[11:23:05] <Derick> you can still add two emails with the same value in *one* document

[11:23:26] <daslicht> hm how do i slove it ?

[11:23:39] <Derick> if you want to prevent that, you need to store users in their own collection

[11:24:04] <NodeX> or dupe the data with an "emails" :[] and add the email to that too

[11:24:05] <Derick> I don't see why you have *one* document with the _id of blogs? Don't you just want a collection called blogs?

[11:24:12] <daslicht> ok i have less influence on the structure since its readed but derby

[11:24:35] <Derick> I've no idea what derby is

[11:24:38] <daslicht> i keep trying

[11:24:57] <daslicht> its a webframework based on expressjs

[11:26:00] <daslicht> so which structure should i create , in order to get unique emails ?

[11:26:53] <daslicht> instead of storiung my users in a array , store them as independent docs?

[11:27:09] <Derick> yes

[11:27:48] <daslicht> ok let me try something :)

[11:28:15] <daslicht> if that do not work i implement my own database persistence

[11:28:26] <megawolt> i'm getting "Unable to connect to a member of the replica set matching the read preference Primary" Error 1 of 3 times.Could anyone help?

[11:32:27] <Derick> megawolt: post your connection code snippet, and the output of "rs.config()" online somewhere

[11:35:58] <megawolt> Derick : http://paste.org/60819

[11:36:53] <Derick> megawolt: can't view that anymore. They require me to not have an adblock

[11:37:13] <Derick> (and I'm not turning it off)

[11:37:21] <Derick> different pastebin please

[11:38:38] <daslicht> its working!

[11:38:49] <Derick> good :-)

[11:39:06] <daslicht> http://pastie.org/5959803

[11:39:12] <daslicht> i have this structure now

[11:39:36] <Derick> megawolt: try pastie.org (with an i)

[11:39:59] <daslicht> no captcha terror there :)

[11:41:25] <megawolt> http://pastie.org/private/y6a0ysmzrnkk60wrsvqy1a

[11:43:20] <Derick> megawolt: Nesine.PL.Web.AppSettings.MongoDBConnString << what is that set to?

[11:45:41] <megawolt> mongodb://**.**.***.***:27017,**.**.***.***:27017,**.**.***.***:27017/?replicaset=RSNSN&readPreference=primary&connectTimeoutMS=60000

[11:46:29] <Derick> megawolt: you use IP in there? Your replicaset config doesn't

[11:46:51] <daslicht> arg still adding duplicates

[11:47:13] <daslicht> how can i check if my esure index unique is set correct ?

[11:47:16] <Derick> megawolt: you need to use the consistent names for them in both places

[11:47:22] <Derick> daslicht: pastie it again? It looked ok

[11:47:44] <daslicht> ok , let me first try it with a blank db

[11:48:31] <megawolt> Derick: I have changed with machine names but in that time i have more connection problem from stage machine

[11:49:02] <Derick> megawolt: you need to do either option. Either IPs every where, or names

[11:49:04] <megawolt> with IPs working great in Stage the problem occuring in my local

[11:49:14] <Derick> also make sure all your DNSes know the correct IP to go with them

[11:53:26] <megawolt> i have corrected cnnstr with names and pinged names

[11:54:04] <megawolt> stage working good but my local has the same problem

[11:55:27] <Derick> megawolt: I suggest you write to the google groups list (mongodb-user) - I don't think the c# people hang out here and I'm at a loss now

[11:58:08] <daslicht> ok now i have the following added

[11:58:09] <daslicht> http://pastie.org/5960081

[11:58:11] <megawolt> Derick: OK. Thank for helps.

[11:58:18] <daslicht> thast the stucture which will be created

[11:59:28] <daslicht> still strange somehow

[12:00:26] <NodeX> daslicht : that's a weird structure

[12:00:30] <daslicht> yeah :)

[12:00:46] <NodeX> {_id:ObjectId(.....), name:"",email:"",foo:""}....

[12:00:48] <daslicht> what is that uuid i used , is that the document name ?

[12:00:53] <Derick> yes

[12:01:00] <daslicht> i cant define that there

[12:01:02] <Derick> but why is name, email nested still?

[12:01:18] <Derick> daslicht: sounds like you're fighting against your ORM - I suggest you ditch it :-)

[12:01:24] <daslicht> looos like that is wropper is somehow broken

[12:01:26] <NodeX> +1

[12:01:30] <daslicht> or i think wroing

[12:01:56] <daslicht> its a model persistence thinggy

[12:02:25] <daslicht> https://github.com/codeparty/racer-db-mongo

[12:02:26] <daslicht> this one

[12:02:35] <NodeX> you should drop the mapping and deal directly with the db

[12:02:50] <NodeX> you will 1. learn more about mongo and 2. have a lot more power over your data

[12:03:04] <Derick> ++

[12:03:55] <daslicht> i diod thgat

[12:03:59] <daslicht> did taht before

[12:04:08] <daslicht> with expressjs and mongo native

[12:04:24] <daslicht> i just wanted to give it a try

[12:06:09] <daslicht> lol

[12:06:24] <daslicht> that db-race-mongodb is a wrapper of a wrapper

[12:06:41] <daslicht> db-race-mongodb uses mongoskin which is a wrpapper of mongo native

[12:06:42] <daslicht> haha

[12:06:54] <daslicht> ok that screems for errors

[12:28:11] <haxplorer> I'm trying to build mongo with v8. Mongo version 2.2.1, v8 from trunk. I used the I_know_I_should_build_with_GYP=yes option in v8 to build it with scons. Mongo detected the v8 libraries while configuring. But while compiling, I get this error - http://pastebin.com/AzHTXy31

[12:28:29] <haxplorer> What am I doing wrong?

[12:38:20] <Zelest> If I plan to store a lot of websites in mongodb (as in, raw html files), is a document field or gridfs the preferred way?

[12:41:07] <remonvv> ron: I know dude. Tears were shed.

[12:42:50] <remonvv> Zelest: Probably not. Why are you storing HTML files in a database?

[12:43:57] <Zelest> remonvv, writing a crawler and wish to store the raw data.

[12:49:16] <remonvv> Zelest: Alright. Well GridFS is specifically designed to store documents that exceed the single document size constraints.

[12:49:35] <remonvv> Zelest: It shouldn't really be used as a generic file storage solution. As such you might as well store the HTML in normal documents.

[12:49:49] <Zelest> Ah

[12:50:01] <remonvv> Zelest: That said I'd still prefer a solution where the HTML is stored on a file storage solution and you place the queryable metadata for each document in MongoDB

[12:50:33] <remonvv> e.g. Adding a document means uploading it to S3 or something, parse and extract metadata, store that in MongoDB with the S3/CF URL in the document.

[12:50:46] <remonvv> Best of both worls.

[12:50:48] <remonvv> worlds*

[12:50:56] <Zelest> :o

[12:51:01] <Zelest> sounds slow :P

[12:51:16] <remonvv> Which bit of it is slow?

[12:51:28] <Zelest> uploading it to S3

[12:52:01] <remonvv> Storing 1MB of data in MongoDB isn't going to be significantly faster than S3. That said S3 as a much larger per upload time overhead ofcourse.

[12:52:16] <remonvv> But the crawling is async so that's managable. If you need it to be faster use more threads to upload.

[12:52:59] <remonvv> Anyway, storing static data that you'll never query against seems unpractical. Might be a good first step though.

[12:58:32] <NodeX> storing eggs is awesome

[12:58:43] <NodeX> I dont have to shell out on expensive hardware

[12:59:01] <NodeX> it's an eggciting new technology

[13:08:52] <haxplorer> v8 works well with the development version which is the HEAD of master in github

[13:09:50] <haxplorer> Does anyone know when 2.4 is going to release? Is the roadmap timeline in https://jira.mongodb.org/browse/SERVER the release timeline too?

[13:10:27] <haxplorer> Eager coz, 2.4 would have v8 as the default engine

[13:30:58] <lar1914> learning about shell scripts... Is there a way to NOT echo result of cmd? I only wanna print when I say print('adsf')

[13:31:08] <lar1914> @echo off :)

[13:37:58] <daslicht_> ok i suspend my investigations on derbyjs

[13:38:01] <daslicht_> to weird

[13:38:14] <daslicht_> its datastructure is strange

[14:08:09] <remonvv> Someone downvotes my SO answer. I want to hunt this person down and shake my fist at him or her in an angry fashion!

[14:10:07] <HardFu> how can i query between values of two field

[14:10:28] <HardFu> field1: 1, field2: 3

[14:10:39] <HardFu> field1:3, field2:5

[14:10:46] <HardFu> field1:5 field2:7

[14:10:54] <HardFu> now let's say I have number 4

[14:11:08] <HardFu> I want to get second row

[14:11:13] <HardFu> how could I write the query?

[14:13:04] <NodeX> $gte : 123, $lte : 123

[14:13:35] <NodeX> field1: {$gte : 123}, field2:{$lte:123}

[14:14:00] <HardFu> i only have number 4 at my disposal

[14:14:26] <HardFu> I don't actually know values of field1 and field2

[14:14:40] <NodeX> good luck with that ;)

[14:16:13] <HardFu> this is analogous to mysql's BETWEEN operator

[14:17:06] <NodeX> not over 2 fields it's not

[14:17:30] <NodeX> between requires the same field

[14:18:06] <NodeX> if you know the upper bounds then it's easy but you say you don't so it's not going to be easy

[14:18:23] <remonvv> {field1:{$gte:4}, field2:{$lte:4}}?

[14:18:48] <HardFu> http://dev.maxmind.com/geoip/csv

[14:18:58] <HardFu> this is what I'm trying achieve

[14:19:27] <HardFu> I have the CSV in the database

[14:19:59] <NodeX> right, easiest way is to convert it to iplong

[14:20:11] <NodeX> and range it.... you do have the upper bounds

[14:21:06] <HardFu> I'll have a go

[14:21:07] <HardFu> thanx

[14:21:38] <NodeX> infact it's a reverse lookup, one sec

[14:23:43] <NodeX> can you pastebin db._your_collection.findOne();

[14:25:50] <HardFu> http://pastie.org/5962812

[14:28:44] <NodeX> yer, you need this .... db.foo.find({num_from:{$lte:12345},num_to:{$gte:123456}});

[14:28:57] <NodeX> and you'll need a compound index on num_from,num_to

[14:31:40] <HardFu> why the index?

[14:32:02] <ron> why like this?

[14:33:58] <NodeX> well if you want it slow then don't have one ;)

[14:34:54] <NodeX> and you should also loop to insert them as your num_from / num_to are inserted as strings not ints

[14:40:10] <HardFu> ah, lol

[14:40:16] <HardFu> that's what my problem was all along

[14:40:20] <HardFu> wrong data type

[14:42:49] <aster1sk> Greetings all - I'm having big problems with replication.

[14:42:54] <aster1sk> I've terribly slow replication - there's master / slave / arbiter configuration without write concern. On a sigle primary we do ~8000 upserts / second - on replication we're doing ~300

[14:48:02] <aster1sk> Any suggestions to remedy replication lag?

[15:05:57] <remonvv> What kind of writes? inserts or updates?

[15:06:42] <remonvv> Also, pastie mongostat --discover output and db.getReplicationInfo()

[15:06:48] <aster1sk> Updates

[15:07:53] <aster1sk> D.O.T creating pastie now.

[15:08:00] <remonvv> Sure the secondary has the indexes it needs?

[15:08:12] <remonvv> Mind you, they should be there automatically

[15:08:15] <aster1sk> Yes indexing is fine on both.

[15:08:24] <remonvv> Alright

[15:08:54] <aster1sk> So with safe writes on we don't see this issue however we're only updating ~1500 / second.

[15:09:04] <aster1sk> Where as with safe writes off we average around 8000

[15:09:30] <remonvv> Sustained?

[15:09:34] <aster1sk> @8000 updates per second though the secondary falls behind.

[15:09:36] <remonvv> Unsafe writes can be queued

[15:10:16] <aster1sk> http://pastie.org/pastes/5963880/text

[15:10:39] <aster1sk> Top is obvs secondary replacation and second block is primary.

[15:11:40] <remonvv> Both seem about equally fast. Where are you getting the 300/s from?

[15:12:43] <aster1sk> Background : we're migrating ~800 million records from SQL, I've scripted the migration and there are different update queries.

[15:13:09] <aster1sk> So with one of the migration operations we can get as low as 300 QPS which is unbearable on 800 million records.

[15:13:41] <remonvv> Any particular reason you're migrating into a replicaset instead of doing the import first and then sync a secondary?

[15:14:39] <remonvv> 8000/s still seems on the slow side of things for unsafe writes by the way.

[15:14:45] <aster1sk> I suggested that, apparently dot doesn't want to.

[15:15:04] <aster1sk> We've decided to use safe writes all the time now.

[15:15:36] <remonvv> Alright, and they're updates rather than inserts because the migration is more complicated than simply bulk inserting the data I assume

[15:15:46] <aster1sk> That's correct

[15:16:00] <remonvv> As in, you could generate large JSON files with all the migration steps performed and use mongoimport for a very fast import.

[15:16:06] <aster1sk> We've also tested only setting w : 1 every 10

[15:16:21] <aster1sk> That's clever, I should try this.

[15:16:44] <remonvv> That doesn't really do that much. unsafe writes simply skip waiting for a getLastError command

[15:17:22] <remonvv> You're not forcing a flush of the other 9 or something as far as I'm aware.

[15:17:25] <aster1sk> We run a lot of increments as well so not sure if dumping to fs would work.

[15:17:30] <aster1sk> No forced flush.

[15:17:55] <remonvv> Alright, no option to seperate the increments and other complicated updates from the main import?

[15:18:09] <remonvv> So, import the bulk data and only perform the complex updates on the database.

[15:18:33] <remonvv> Anyway, your original problem is odd. Same hardware and same setup should result in roughly similar performance for PRI and SEC

[15:19:31] <remonvv> aster1sk: Can you dump db.getReplicationInfo() for me?

[15:19:36] <remonvv> See what the oplog lag is and such

[15:19:41] <aster1sk> I thought so as well. Hardware specs are 8 cores, 16 gb / memory raid 10

[15:19:49] <aster1sk> Sec

[15:20:30] <aster1sk> I'm MITM without root so have to ask dot every time.

[15:21:30] <aster1sk> http://pastie.org/private/67pxj9ihlczk6yugvsgvg

[15:22:03] <remonvv> That's not practical at all is it ;)

[15:22:41] <aster1sk> It certainly is not.

[15:23:58] <remonvv> 2 seconds behind, that seems good. I'm assuming this is the safe writes option?

[15:24:13] <aster1sk> Correct.

[15:24:42] <remonvv> When you go for unsafe writes, does it start fast but suddenly degrades in performance?

[15:24:46] <aster1sk> Once every 10 writes has the 'w' flag on.

[15:24:55] <aster1sk> remonvv: precisely.

[15:27:13] <joniloky> Hi guys

[15:27:35] <joniloky> Got a question about mapreduce

[15:28:29] <joniloky> Is it possible to have MapReduce run on a secondary and somehow push the results to the primary

[15:28:57] <remonvv> aster1sk: It's hard to tell exactly what's going on from this. Might be that the unsafe writes are accepted faster than they can be written to the oplog on PRI. Altneratively the problem is on SEC side and there is some sort of bottleneck that makes it not process the oplog as fast as it's being filled. The former seems more likely. Do you see any hardware resource bottlenecks on SEC? (cpu, hd, etc.)

[15:29:13] <remonvv> joniloky, no.

[15:29:30] <joniloky> I'd like to prepricess data and I don't want the mapreduce to impact my performance

[15:29:39] <joniloky> remonvv: Any suggestions ?

[15:30:24] <remonvv> joniloky: Two: 1) That sort of performance scaling is best done through sharding, and 2) Look into the AF as a M/R replacement since that isn't single threaded and inherently faster and sharding compatible.

[15:30:26] <joniloky> remonvv: What is the "preferred" usage of MR to avoid performance degradation

[15:30:44] <joniloky> Ok

[15:31:00] <remonvv> joniloky: There is no way to use m/r without performance degradation. You're doing more work with the same resources. The question is not does it slow down but is it still fast enough for you.

[15:31:17] <remonvv> Full m/r is best implemented on seperate m/r instances through hadoop or similar.

[15:31:52] <remonvv> Aggregation Framework is faster and "better" but not quite as flexible.

[15:31:59] <remonvv> Yet (tm)

[15:32:23] <joniloky> remonvv: are there any plans to make it possible for m/r to run on a secondary(maybe even hidden) and push the results?

[15:33:24] <joniloky> Do you know?

[15:33:30] <aster1sk> There are zero hardware bottlenecks remonvv -- system is practically idle.

[15:34:55] <aster1sk> I'm suprised, disk io is 1% or so, memory is nil

[15:39:14] <remonvv> joniloky: It runs fine on a secondary if you really want to but you shouldn't. Secondary is used for replication and eventually consistent reads ;)

[15:39:32] <remonvv> aster1sk: Seems that PRI is the limiting factor then. Is that not busy either?

[15:39:47] <aster1sk> Neither of them are busy at all with safe writes.

[15:40:02] <remonvv> aster1sk: I do see 99%+ write locks on secondary

[15:40:04] <joniloky> remonvv: Yes, but on secondary i cant store to a collection

[15:40:24] <aster1sk> Yeah write lock is ridiculous, but disk io is 1% and memory is nil

[15:40:29] <remonvv> aster1sk: In the pastie I mean, but that's during a burst of 4-6k writes sec, not the 300 you mention

[15:40:38] <remonvv> which db version is this btw?

[15:40:53] <aster1sk> I think we're past the 300 / sec issue now, we've modified the migration script.

[15:41:21] <remonvv> joniloky: no, you'd have to do some hacking to get that data back to primary. It's not something I'd even look into further. Assume it's not possible.

[15:41:33] <remonvv> aster1sk: Okay, so now it's just kinda slow ;)

[15:42:09] <remonvv> aster1sk: Still, at this rate 800 million is going to take a bit and performance will degrade slightly as collections grow.

[15:42:29] <aster1sk> Yeah, during testing we were able to migrate the lot of it in about 4 days.

[15:43:07] <remonvv> aster1sk: It'll go a lot faster if you use sharding.

[15:43:24] <remonvv> aster1sk: Other than that I don't have much suggestions for you if migrate first, sync second isn't an option

[15:46:25] <aster1sk> remonvv: yeah I can't seem to figure this one out - we have a strict launch date but this probably means we can't make the deadline.

[15:47:17] <remonvv> aster1sk: Well, unsafe writes to a PRI and then sync will be significantly faster. Or if you want to be super fast you can simply manually copy the data files of PRI to SEC which should be pretty much an insta sync.

[15:47:55] <remonvv> aster1sk: http://docs.mongodb.org/manual/administration/replica-sets/#resync-by-copying-data-from-another-member

[15:54:55] <BadCodSmell> How can I detect when mongo has finished booting in bash?

[15:55:00] <BadCodSmell> since it returns immediately

[15:56:22] <BadCodSmell> ie it returns prior to daemonisation

[15:56:25] <BadCodSmell> a long time prior sometimes

[15:57:57] <aster1sk> remonvv: you've been most helpful - I do believe this is the route we will take

[16:01:09] <remonvv> aster1sk: You're very welcome, good luck ;)

[16:01:59] <remonvv> BadCodSmell: Not sure there's an easy solution. You can look for "waiting for connection on port " string in the log

[16:16:29] <NodeX> modify the init.d script to wait ;)

[16:33:08] <BadCodSmell> remonvv: That raises another question, wtf does anyone use mongo?

[16:34:59] <remonvv> BadCodSmell: First "w" is "why" I assume?

[16:35:34] <NodeX> BadCodSmell : wtf not ? ;)

[16:35:54] <remonvv> BadCodSmell: It's relatively simple. If you think of problems that require a persistence solution as different types of dinner then MongoDB is like mayo.

[16:36:11] <NodeX> blog it^

[16:36:12] <remonvv> BadCodSmell: Good with some dinners but not all of them so you have to know which dinner becomes more tasty with mayo

[16:36:42] <NodeX> but in Belguim mayo goes with everything hahaha

[16:37:09] <remonvv> BadCodSmell: And then there's a bunch of barely informed critics that hate mayo just because other people hate mayo and/or because they don't like the look of it even though they've never tasted it.

[16:37:20] <NodeX> +1

[16:37:21] <remonvv> I need to use this mayo analogy more often. It's bulletproof

[16:38:49] <NodeX> in the uk that would mean "It's Mustard"

[16:39:09] <NodeX> "It's mustard" == "It's sh*t hot" / "really good"

[16:39:19] <remonvv> Them UK people...

[16:39:29] <NodeX> bunch o' weirdo's if you ask m

[16:39:31] <NodeX> me *

[16:39:37] <remonvv> Amen sir, amen.

[16:39:47] <NodeX> think I finaly got my migration done#

[16:40:02] <NodeX> 14 apps, 2 servers start to finish 13 hours

[16:46:54] <BadCodSmell> mongo is like a crude sauce that doesn't particular enhance any dish that comes to mind

[16:47:11] <BadCodSmell> a watery gravy for example

[16:49:00] <NodeX> then dont use it - simple problem solveds

[16:49:01] <NodeX> -

[16:49:03] <NodeX> -s

[17:38:02] <remonvv> BadCodSmell: This is much like stepping into an Apple store and starting to pronounce that iPads suck. A high troll factor.

[17:44:54] <JoeyJoeJo> Does the driver I'm using have any affect on the speed of inserts or queries? For example, if I write a client in python is it any faster or slower than a similar client written in C?

[17:47:42] <JoeyJoeJo> If some drivers are faster, is one known to be the fastest?

[17:47:42] <MatheusOl> JoeyJoeJo: It could be if the overhead on the app-side is too huge

[17:48:25] <MatheusOl> Perhaps the way the drivers serialize objects to BSON format can make some difference on speed

[17:48:40] <MatheusOl> But I don't know what is the fastest one

[17:49:35] <JoeyJoeJo> That's good info though, thanks for your help

[17:50:30] <MatheusOl> You could create a benchmark, and show it to us

[17:50:31] <MatheusOl> =D

[17:51:10] <JoeyJoeJo> That would require me to know C, or any language that isn't python

[17:52:55] <matt__> sorry to bother everyone, but I wanted to know if and what is the mongo equiv of mysql's ~/.my.cnf to specify default datbase

[17:53:48] <matt__> sorry to bother everyone, but I wanted to know if and what is the mongo equiv of mysql's ~/.my.cnf to specify default datbase

[17:55:55] <tjgillies> is there a way to select a date range from mongo? like documents created within the last week?

[17:55:57] <MatheusOl> matt__: I think you want ~/.mongorc.js

[17:56:26] <MatheusOl> tjgillies: db.col.find({dt : {$gt: A, $lt: B}})

[17:56:34] <tjgillies> MatheusOl: thank you :)

[17:56:43] <MatheusOl> tjgillies: you're welcome

[17:57:18] <matt__> MatheusOl: perfect, thx

[17:57:38] <tjgillies> what are the little guys with $ called?

[17:58:41] <MatheusOl> Modifiers or operators

[17:58:46] <MatheusOl> tjgillies: ^

[17:58:46] <tjgillies> MatheusOl: thnx :)

[18:00:49] <MatheusOl> In this case they are operators

[18:00:58] <MatheusOl> Those used on update are modifiers

[18:06:23] <marktraceur> OK, I have a bit of a bend-over-backwards query to make. I want to add in a "sort-by" field to a collection, and I need to run some queries to add this field to existing documents. The problem is, this collection's documents are linked together in subgroups, so there will be many documents with sortkey of "0". Is there some way I can use aggregation parameters in an update query?

[18:06:48] <marktraceur> Basically my theory is that I can group by the subgroup ID, and then limit to one per group somehow.

[18:11:00] <marktraceur> Or maybe not? Maybe I'm forced to run a bunch of queries and do it all on a per-document basis? This wouldn't be all too much trouble, but it's obviously not ideal.

[18:13:11] <JoeyJoeJo> Assuming that my shard key is field "a" and I have fields "a","b", and "c", what happens when I do a query for field "c"? How does mongos know on which shard to look for field "c"?

[18:14:39] <JoeyJoeJo> Or I should say, if I search for {c:10}, how does mongos know on what shard to find documents where c == 10?

[18:14:40] <MatheusOl> marktraceur: the same subgroups should have the same sortkey values? Is that what you want?

[18:15:07] <MatheusOl> JoeyJoeJo: It will look it in all shards

[18:15:49] <JoeyJoeJo> So would an index on field "c" speed up that query?

[18:16:45] <marktraceur> MatheusOl: Each subgroup's sortkeys should be sequential. So I can sort each subgroup.

[18:17:03] <matt__> <MatheusOl> Do you know what I would add to mongorc.js to use a different database?

[18:17:14] <matt__> right now, when i load client it is using "test" db

[18:18:18] <marktraceur> MatheusOl: I'm starting to think that I'm doing this wrong, for Mongo.

[18:19:24] <marktraceur> Maybe it would be easier to start a "sorting" collection that kept arrays of ObjectIDs, indexed by subgroup IDs, that I can update with one query instead of three.

[18:21:45] <MatheusOl> marktraceur: I'm not sure I understood your problem

[18:22:00] <MatheusOl> matt__: You can do that in command line: mongo localhost/yourdb

[18:22:19] <marktraceur> MatheusOl: That's all right, I'll make do and if someone else has any other ideas I can try those :)

[18:22:22] <matt__> yeah i know, but i like to use config files so i dont have to type it every time

[18:22:35] <marktraceur> MatheusOl: Unless you'd like more details that might help explain

[18:23:18] <MatheusOl> matt__: at Linux?

[18:23:26] <MatheusOl> put an alias on your .bashrc

[18:23:26] <matt__> yeah

[18:23:35] <matt__> cool, that would work

[18:23:36] <matt__> thx

[18:23:47] <MatheusOl> or, at your .mongorc.js: db = connect('foo');

[18:25:28] <MatheusOl> marktraceur: I got it

[18:25:55] <MatheusOl> marktraceur: what is the order now? Random?

[18:26:22] <marktraceur> MatheusOl: Presumably they're ordered by ObjectId, but I really haven't tested that theory. I'd guess it's the order in which they're created.

[18:27:46] <marktraceur> So yeah, it's the order of creation.

[18:28:15] <MatheusOl> marktraceur: If you use the default ObjectId it is already the creation order

[18:28:20] <MatheusOl> marktraceur: so just sort by it

[18:28:42] <marktraceur> MatheusOl: That's what it is now, but I want it to be sortable by the users

[18:29:18] <MatheusOl> marktraceur: humm...

[18:29:29] <MatheusOl> is the collection huge?

[18:29:32] <marktraceur> MatheusOl: So I need to either add a new field, which means some complex update queries, or add a new collection, which seems a lot simpler.

[18:29:55] <marktraceur> MatheusOl: It's about 150 nodes, and there are about 7 collections of similar size that I need to do this for also.

[18:30:12] <MatheusOl> 150... so I assume it is huge

[18:30:27] <marktraceur> The subgroups range from having 1 to 30 nodes each. But yeah, it's not tiny.

[18:30:37] <marktraceur> Sorry, one node is one document.

[18:30:57] <MatheusOl> In this case, why don't you use ObjectId and add the sortable field only when user request it

[18:31:09] <MatheusOl> so, how many nodes it is?

[18:32:01] <marktraceur> MatheusOl: How would I use the sortable field if only one node had it out of the subgroup? The one node would either be at the beginning or at the end, and not a lot I could do about it.

[18:33:35] <MatheusOl> Wait a sec. What is node for you? A shard? Or a subgroup?

[18:33:44] <marktraceur> MatheusOl: It's a document. Sorry.

[18:34:16] <marktraceur> I'm working on this: http://orgcharts.wmflabs.org:8888/#5085aa408fedf26b68000001/

[18:35:48] <MatheusOl> marktraceur: I have no idea what it is

[18:35:58] <MatheusOl> marktraceur: but each node (on the picture) is a document, right?

[18:36:05] <MatheusOl> marktraceur: and they are all on the same collection

[18:36:16] <marktraceur> Yup

[18:37:07] <marktraceur> And I want to be able to change the visual order of the nodes. But it would mean changing the database somehow and I'm not sure of the best method.

[18:40:56] <MatheusOl> humm... So it's not huge... =D

[18:41:31] <MatheusOl> It would be easier to navigate on each document and add the field

[18:42:18] <marktraceur> Yeah, I figured

[18:42:32] <marktraceur> Then that is what I shall do! Thanks!

[18:42:38] <MatheusOl> =D

[18:43:57] <marktraceur> I'll also lurk a bit, maybe I can give back at some point.

[18:52:01] <Dededede4> hello

[18:53:31] <Dededede4> Mongodb requires a 64-bit architecture to manage a database of more than 2 gigabytes. It also works with PowerPC64?

[18:55:03] <vr_> I have a db w/ roughly 150K documents in them. There's a field that I'd like to do a sort/limit to get top-100 or so of.

[18:55:33] <vr_> Doing this via mongodb was painfully slow but just grabbing those fields and doing it in a scripting language took no time @ all.

[19:03:28] <ameoba> Dededede4: never heard anything about an x86 requirement for it. They don't have binary downloads so you'd have to compile it yourself.

[19:07:21] <Dededede4> Yes, but mongodb Will he be able to manage a database of more than 2 gigabytes?

[19:08:24] <ameoba> Dededede4: should

[19:08:40] <ameoba> Dededede4: ...as long as you're running a 64-bit OS

[19:12:17] <Dededede4> ok

[19:27:34] <owen1> using the Node.jsnative driver, how to make my secondaries readable? (2.0.1)

[19:36:36] <tomlikestorock> my previously fully operational replica set now has what seems to be a rogue slave, which only reports "loading local.system.replset config (LOADINGCONFIG)" when I try to bring it back up

[19:36:41] <tomlikestorock> 2.2.0

[19:39:37] <tomlikestorock> and the master reports this slave as "DOWN" and "still initializing"

[19:40:59] <tomlikestorock> I can connect to mongo from each opposite server, but I can't get this replica set back up. I've logged onto the slave, deleted the repl config, brought down the slave, deleted the replicated datafiles, and brought the system back up. After it replicates, the problem reoccurs

[19:43:28] <tomlikestorock> what gives?

[19:44:18] <JoeyJoeJo> Is there a command that will tell me how long the previously run query took? Or is there something I can add to a find() query that will tell me how long it took?

[19:51:02] <owen1> is it possible for a client to access the nearest mongo host in a replicaset (read freference = near) in version 2.0.1?

[21:26:32] <svm_invictvs> heh, there isn't much to gridfs is there?

[21:33:36] <KeepSafe> Hi, what would be a query that would return documents containing a value in at least one attribute be?

[21:35:10] <ron> an insane one.

[21:36:21] <KeepSafe> ron: would I have to perform a query for each attribute then and create a set in my application?

[21:36:57] <ron> you can create a single query, but you'd have to $or between all options.

[21:37:22] <KeepSafe> ok

[21:37:34] <ron> it basically suggests that your data model may not be optimal.

[21:39:48] <mr_smith> depends how often that query is being executed.

[21:40:48] <KeepSafe> ron: it was going to be site-wide searching, blog posts, accounts, etc.

[21:41:09] <ron> smells like an external index to me.

[21:41:28] <KeepSafe> ron: ?

[21:41:36] <ron> though like mr_smith said, it also depends on how often the query is being executed.

[21:42:10] <KeepSafe> ron: it could be executed quite frequently

[21:42:44] <ron> KeepSafe: well, without more details it would be difficult to say.

[21:43:26] <mr_smith> everything in nosql subjectively awesome or awful depending on the usage and use case. track the execution and response times and fix it if it's a problem.

[21:44:01] <KeepSafe> ron: the search would search attributes fo user accounts (email/name) as well as all blog posts (title/taglist/author)

[21:45:36] <ron> search for what?

[21:46:14] <ron> if you're searching in a title, it sounds like you need some form of full-text search.

[21:47:03] <KeepSafe> ron: a string would be provided, and the fields would be checked for the values

[21:47:24] <KeepSafe> I was planning on using icontans

[21:47:31] <ron> ah?

[21:47:34] <KeepSafe> icontains *

[21:50:21] <ron> I'd seriously consider looking into an external indexer in that case.

[21:50:48] <mr_smith> like elasticsearch?

[21:51:24] <ron> like elasticsearch. or solr. or lucene. or sphinx. or whatever jiggles your bell.

[21:51:41] <cr3> hi folks, I'm trying to use mongoexport --db foo --collection bar | mongoimport --db foo --collection bar --drop, but now my ruby application is reporting errors because it seems that symbols aren't being imported properly. does that ring a bell for anyone?

[21:54:38] <cr3> aha! there's a big disclaimer at the top of the mongo exporting and importing part of the manual that says mongoexport will lose some information when converting from bson to json. so, can someone recommend a way to export the data in a way that I can modify it in a text editor before importing it?

[22:44:45] <sg> So I have a collection of documents where each document has a created_at attribute, and I need to do a select by DATE, i.e. select every document where create_at is a Monday, etc.

[22:44:50] <sg> er, by DAY*

[22:45:22] <sg> Anyways, it looks like there's no efficient way to do this, and that I am stuck with having to use $where: function () { this.created_at.getDay() == 1 }

[22:46:11] <sg> I have to do this a number of times, so I'd like to iterate over every document in this collection (over 39mil of them) and just add a new field called day_of_week or something and update said field with the day of the week, that way I can index it for performance later on

[22:46:22] <sg> Is there any way to do that

[23:02:41] <fommil> hi all – I've been getting "cursor 0 not found on server localhost" using the java driver. Some googling shows that I need to turn off the timeout – is there a way to do this through MongoOptions?

[23:05:00] <fommil> never mind, found maxWaitTime

[23:05:07] <sg> also my aggregation is not working

[23:05:13] <sg> db.tweets.aggregate( { $project : { text: 1, day: { $dayOfWeek:"$created_at" } }, $limit : 5 } );

[23:05:27] <sg> it's acting as if i had done db.tweets.find() with a limit of 5.

[23:05:28] <sg> i don't understand

Log file Viewer

Help | Karma | Search:

#mongodb logs for Wednesday the 30th of January, 2013