#mongodb logs for Thursday the 14th of August, 2014

[00:20:56] <ToeSnacks> I have hit a namespace limit on my 2.2 environment and have added -nssize 256 to the end of the init script. What should I now do to allow the affected member to rejoin the replica set?
[00:33:53] <joannac> repair?
[00:38:32] <joannac> ToeSnacks: yeah, repair and then let it rejoin
[00:38:44] <joannac> although I'm confused why you only hit this on a single secondary...?
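
A minimal sketch of the repair step suggested above, assuming the member has already been restarted with --nssize 256 via the init script (the database name is hypothetical):

    // run on the affected member; repairDatabase rebuilds the data files,
    // which should pick up the larger namespace file size
    db.getSiblingDB("mydb").repairDatabase()
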
[02:44:12] <delinquentme> Is there a way to verify that the STRUCTURE of a DB entry in mongo is sound?
[02:44:40] <delinquentme> we've got a working login system but anything which is hitting a particular collection is making the app take a pew
[02:56:48] <joannac> what do you mean by "a DB entry"?
[05:02:41] <pratikbothra> Is anybody here? :-)
[05:15:44] <sflint> pratikbothra: whats up
[05:16:38] <pratikbothra> sflint: Hello
[05:16:47] <sflint> hi
[05:21:10] <sflint> pratikbothra: did you have a question?
[05:23:35] <pratikbothra> @sflint - Sorry, I got disconnected.
[05:23:42] <sflint> all good
[05:24:03] <pratikbothra> How do you handle updating populated objects in mongodb? Take the use case of likes in post (likes: [{type: Schema.ObjectId,ref: 'User'}])
[05:24:10] <pratikbothra> While retrieving, I use populate and get the user name, link along with their id's. But on updates back to mongodb, I'm facing a lot of trouble as mongodb is expecting an array of id's, while I'm sending it populated objects back. What is the best way to handle this?
[05:25:22] <pratikbothra> Is there a mongodb operation that I am not thinking of while updating?
[05:25:52] <pratikbothra> On post update, I do something like this var post = req.post; which is followed with _.extend(post, req.body); ....Now imagine handling post.likes (both likes being added and people removing their likes), its so dirty
[05:26:42] <pratikbothra> sflint: How would you do something like this?
[05:28:23] <sflint> not sure I understand the issue. What does your document look like? You just want to keep a count of likes on a post?
[05:29:02] <sflint> to update a document you can just use 'db.collection.update({<query>},{<update>})'
[05:29:56] <sflint> are you trying to update multiple documents at once?
[05:30:44] <sflint> can you just tell me on the most basic level what you are trying to track? what data needs to be stored?
[05:32:46] <sflint> pratikbothra: you still there?
[05:32:57] <pratikbothra> Yup, there
[05:33:26] <pratikbothra> It's a social collaboration site
[05:33:39] <pratikbothra> where people can like and then unlike posts
[05:34:46] <tyteen4a03> can I rely on mongodb generating document _ids for use in applications?
[05:35:09] <pratikbothra> @tyteen4a03 - of course
[05:35:39] <tyteen4a03> great, so I don't have to create my own field
[05:36:22] <pratikbothra> sflint: But at the moment we had a consolidated update function for posts.
[05:36:57] <sflint> pratikbothra: so you need to $inc : 1 and $inc : -1
[05:37:06] <sflint> on each post
[05:37:20] <pratikbothra> So in posts document, the title, body, any of its attributes could be updated. And the line _.extend(post, req.body); worked well.
[05:37:59] <pratikbothra> In a typical post document likes could be ['53cec89600c915dfcd0231d0', '53cec89600c915dfcd023123', '53cec89600c915dfcd023122']
[05:38:18] <pratikbothra> where each of the 3 is basically a reference to the user object.
[05:39:23] <pratikbothra> The problem being that when the data is retrieved we do a .populate('likes', 'name username picture')
[05:39:23] <sflint> pratikbothra: okay
[05:40:16] <pratikbothra> sflint: Is the problem statement more clear?
[05:40:55] <sflint> not really...still don't see a mongo question yet?
[05:41:06] <sflint> i can't tell you what your code is doing in that call
[05:41:20] <pratikbothra> Do I have a separate route for like and unlike , something like postId/like/userId and for unlike postId/unlike/userId ?
[05:41:27] <pratikbothra> Or is there a way in the update function
[05:41:53] <pratikbothra> for me to tell mongo to automatically convert all these objects to an array of ids before saving it?
[05:41:59] <sflint> having a separate route is up to you for the api
[05:42:22] <sflint> you can't tell mongo to convert anything it is a database
[05:42:27] <sflint> it takes what you give it
[05:43:34] <tyteen4a03> pratikbothra, and _id is an UUID, right?
[05:43:45] <sflint> you can use the $in :[]
[05:43:50] <sflint> for an update
[05:44:22] <pratikbothra> sflint: $in to check for the object id's which have liked it?
[05:44:50] <pratikbothra> tyteen4a03: well, sorta. It also gives you the timestamp. What are you exactly looking for?
[05:45:21] <tyteen4a03> pratikbothra, I need UUIDs for every document; that's it
[05:45:26] <davasaurous> Is there a go-to way of configuring the collections of a new MongoDB DB post-deploy? I'm looking for a script or the like for setting up uniqueness constraints and indices and the like
[05:45:39] <sflint> pratikbothra: $in allows you to say "update all documents that match": $in : [<id>, <id>]
[05:45:44] <pratikbothra> tyteen4a03: that works for sure. It is always going to be unique.
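
For reference, the auto-generated _id is an ObjectId rather than a standard UUID; its leading bytes encode the creation time, which the shell can read back:

    // mongo shell
    var id = ObjectId()
    id.getTimestamp()   // ISODate of when the id was generated (second resolution)
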
[05:46:16] <sflint> davasaurous: you can script it...or just do it in your code....it is all dynamic
[05:46:42] <pratikbothra> You can use the migrate module
[05:46:46] <pratikbothra> npm install -g migrate
[05:46:51] <pratikbothra> and set up migration scripts
[05:46:56] <pratikbothra> which add indexes and so forth
[05:47:16] <pratikbothra> https://github.com/visionmedia/node-migrate
[05:48:05] <sflint> you can add an index with db.collection.ensureIndex({<key>: 1})
[05:48:17] <davasaurous> We're not using node and I'd rather not install it just for migrations, thanks for the tip though pratikothra we use node for other projects
[05:48:33] <sflint> davasaurous: you can use the shell
[05:48:50] <sflint> use javascript to issue the indexes
[05:49:01] <sflint> i can write the script if want?
[05:49:04] <davasaurous> sflint I know, I'm more looking for a nice way to have an automated series of those run when we initialize a machine or run our server
[05:49:08] <pratikbothra> +1, just the shell....And db.collection.ensureIndex ....Mongodb must be v2.6 though
[05:49:27] <sflint> davasaurous: you only need indexes once?
[05:49:32] <sflint> why do you need to run it more than once?
[05:49:47] <sflint> pratikbothra: ensureIndex works on all versions of mongo
[05:49:55] <davasaurous> Staging, Testing, Production, changes to indexing requirements
[05:50:26] <sflint> davasaurous: you would have to edit anything that you used. Just as easy to have a indexes.js, indexes-qa.js
[05:50:40] <pratikbothra> My bad, was thinking of text search which works 2.6 up though.
[05:51:52] <sflint> davasaurous: i don't know if there are new collections, but you don't want to create indexes on the fly usually...there is a cost of locking associated. You can use {background: 1}
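
A minimal sketch of the shell-script approach suggested above, with hypothetical collection and field names; it could be run at deploy time with: mongo <dbname> indexes.js

    // indexes.js -- safe to re-run: an identical existing index is left untouched
    db.users.ensureIndex({email: 1}, {unique: true, background: true});
    db.orders.ensureIndex({userId: 1, createdAt: -1}, {background: true});
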
[05:52:32] <sflint> pratikbothra: still not sure what you are trying to do, but likes and unlikes should be easy to handle in mongo.....you could use a separate route/resource for each, but mongo doesn't care
[05:55:04] <sflint> db.collection.update({_id : {$in : [<id>,<id>,<id>]}}, {$inc : { like : 1, unlike : 1 }})
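
For the likes use case above, one common alternative to sflint's $inc sketch (assuming likes is the array of user ids shown earlier; postId and userId are hypothetical variables) is a separate targeted update per action instead of rewriting the whole array:

    // like: add the user's id only if it is not already present
    db.posts.update({_id: postId}, {$addToSet: {likes: userId}})
    // unlike: remove the user's id
    db.posts.update({_id: postId}, {$pull: {likes: userId}})
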
[07:13:02] <BurtyB> hmm these collapsing headings on the website are driving me nutts which is a bad thing this early in the morning :/
[07:15:16] <rh1n0> Are there scripts that can be run against mongodb to report on missing indexes (similar to scripts for mysql or postgresql for example that do the same thing?) - been looking for them :)
[07:16:37] <BurtyB> not that I know of but that means nothing :)
[07:16:59] <rh1n0> oh well thanks any way lol - im a bit newish to mongodb
[07:17:15] <BurtyB> someone might come to life that does :)
[07:17:21] <rh1n0> we have a query that takes 25 seconds to run - ugh
[07:17:38] <rh1n0> im trying to decipher it and verify we have indexes created correctly
[07:18:20] <Boomtime> have you run a sample slow query with .explain()?
[07:18:39] <Boomtime> that will at least tell you which index a query used
[07:18:50] <rh1n0> I was looking at that in the documents just now actually
[07:20:13] <sflint> rh1n0: nope
[07:21:02] <sflint> rh1n0: what does your schema look like? and query pattern? i can tell you what index to use
[07:24:04] <sflint> rh1n0: a good rule is if you do db.collection.find().explain() and you see "Basic Cursor" you need an index. But to build to scale you want to use "indexOnly" queries
[07:24:22] <rh1n0> im very new to this stuff. I have a query that looks like this paste http://pastebin.com/4dbpFtJt
[07:24:41] <rh1n0> very good info - thank you for that
[07:25:57] <rh1n0> oh yeah i see BasicCursor heh
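
A minimal sketch of what to look for in 2.x explain() output (collection and field names are hypothetical):

    db.events.find({userId: 123}).explain()
    // "cursor" : "BasicCursor"          -> full collection scan, no index used
    // "cursor" : "BtreeCursor userId_1" -> the userId index was used
    // "indexOnly" : true                -> covered query; documents were never fetched
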
[07:27:17] <sflint> rh1n0: why are you using aggregation framework? Is there some count you are looking for?
[07:27:31] <rh1n0> no idea - i didnt write it :)
[07:27:49] <rh1n0> How can i get rid of it?
[07:28:05] <sflint> db.collection.find
[07:28:13] <rh1n0> ah ok
[07:28:16] <sflint> instead of db.collection.aggregate
[07:28:26] <sflint> just depends on what you are trying to query
[07:29:16] <rh1n0> does the aggregate framework have a performance cost associated with it?
[07:30:11] <sflint> rh1n0: yes
[07:30:43] <sflint> rh1n0: but it is more flexible...you can pipe a set to another action in aggregation
[07:30:54] <sflint> you can also sum and return an answer
[07:31:12] <rh1n0> i see
[07:31:18] <sflint> with a find() you return a cursor to the application and would need to do the aggregation in your app
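
A small illustration of that difference, assuming a hypothetical orders collection:

    // aggregation: the server computes the sum and returns a single result
    db.orders.aggregate([
        {$match: {status: "paid"}},
        {$group: {_id: null, total: {$sum: "$amount"}}}
    ])
    // find(): the server returns a cursor and the application does the summing
    var total = 0;
    db.orders.find({status: "paid"}, {amount: 1}).forEach(function (o) { total += o.amount; });
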
[07:32:22] <rh1n0> thanks very much for the info - good stuff
[07:32:42] <sflint> np...anytime
[07:32:48] <sflint> up doing maint work anyway
[07:33:28] <rh1n0> im usually doing ruby and cloud automation stuff - today i got de-railed into mongodb - fine by me. Always wanted to work with it more.
[07:35:20] <rh1n0> im really blown away by how good the mongodb docs are
[07:35:44] <sflint> rh1n0: yeah....they are for some things
[07:36:10] <sflint> when you get into mongo deep there is a lot missing in the docs
[07:36:41] <rh1n0> i tried to remove the aggregate but attempting to run the query gives me the error: "$err" : "Can't canonicalize query: BadValue unknown operator: $match"
[07:36:56] <sflint> $match is a find
[07:37:04] <sflint> but for aggregation
[07:37:06] <rh1n0> oh!
[07:37:11] <sflint> you can pipe that to something else
[07:37:54] <sflint> and the skip,limit are a little different
[07:38:05] <sflint> db.collection.find().skip().limit()
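
Put together, a pipeline and its find() equivalent might look like this (names are hypothetical):

    // aggregation framework
    db.events.aggregate([
        {$match: {city: "Austin"}},
        {$skip: 20},
        {$limit: 10}
    ])
    // plain query
    db.events.find({city: "Austin"}).skip(20).limit(10)
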
[07:38:36] <rh1n0> should i be using this format instead? db.collection.find( { field: /acme.*corp/i } );
[07:39:38] <rh1n0> as i look thru the mongo docs i dont understand why this query was written this way :)
[07:41:04] <sflint> yeah....for regex
[07:41:22] <sflint> or you can use $regex()
[07:41:25] <sflint> either work
[07:41:43] <sflint> why the regex at all? isn't it looking for Texas?
[07:42:02] <rh1n0> this is an specific use-case
[07:42:20] <rh1n0> its a form where the user can select city, state, zip
[07:42:41] <rh1n0> i dont know why they are using regex lol
[07:43:02] <rh1n0> its one of those deals where the contractor wrote the code, then bailed.
[07:43:06] <sflint> you should validate that server side...and make the client only able to choose "Texas"
[07:43:11] <sflint> nice
[07:43:18] <sflint> that happens
[07:43:32] <rh1n0> yes it does. i tend to end up on 'rescue projects' often
[08:10:09] <rspijker> rh1n0: (I only read the first prt of your query about looking for index suggestions) have you looked at Dex?
[08:10:20] <rspijker> http://blog.mongolab.com/2012/06/introducing-dex-the-index-bot/
[08:10:37] <rh1n0> no ill do that though - thanks!
[08:59:58] <inad922> hello
[09:00:31] <inad922> Is there a way to make a query against a field in such a way that I want to check if the string I'm querying the field with is a substring of the value of the field?
[09:00:44] <inad922> Or better yet can I do regex queries?
[09:01:16] <nfroidure_> inad922, there's an operator for regexp searches
[09:01:42] <inad922> nfroidure_: Could you give me a link to the appropriate mongodb documentation page?
[09:01:42] <nfroidure_> http://docs.mongodb.org/manual/reference/operator/query/regex/
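
A minimal example of both forms, with hypothetical collection and field names:

    // documents whose name contains the substring "phone"
    db.items.find({name: {$regex: "phone"}})
    // same idea with a regex literal, case-insensitive
    db.items.find({name: /phone/i})
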
[09:04:05] <inad922> nfroidure_: Thanks
[09:17:05] <Aartsie_> i will make a mongodump but get this strange error: db already exists with different case already have: [DBNAME] trying to create [DBNAME]", code: 13297
[09:17:11] <Aartsie_> can someone explain this for me ?
[09:21:40] <rspijker> Aartsie_: what does your mongodump command look like?
[09:22:00] <Aartsie_> mongodump -v --db DBNAME
[09:23:16] <f0ll0w> does any1 have some experience with 2-factor-commit?
[09:24:25] <rspijker> Aartsie_: well.. do you have a db named DBname, for instance?
[09:25:03] <Aartsie_> yes i have
[09:25:14] <rspijker> or are the things in brackets in your error _exactly_ the same?
[09:25:33] <Aartsie_> aah i just find out the problem i have a uppercase in the name
[09:25:35] <rspijker> case is important here
[09:46:30] <f0ll0w> does any1 wanna test my two-phase-commit code?
[10:12:28] <f0ll0w> who has experience with two-phase-commits?
[10:28:09] <tmh1999> Hi guys, I am new to mongodb and I am using java driver. I have this kind of error and I don't really know why.
[10:29:46] <tmh1999> when I insert a BasicDBObject with { "PATH" : "/" }. then I insert the same BasicDBObject with different key-val I get this exception com.mongodb.MongoException$DuplicateKey
[10:30:07] <tmh1999> It's is kind of a for-loop so I want to reuse the same BasicDBObject
[10:30:45] <rspijker> tmh1999: if you don’t explicitly specify an _id element, the driver generates one automatically
[10:30:53] <Derick> if you want to do that, you need to unset the _id key *i think*.
[10:30:58] <rspijker> my guess would be that it generates it when you create the DBObject
[10:31:32] <rspijker> you could just explicitly set the _id key to a UUID
[10:32:06] <tmh1999> so I should do like this : BasicDBObject myoj = new BasicDBObject("_id", "mykey") to solve ?
[10:32:19] <tmh1999> yeah I will try
[10:32:40] <tmh1999> Derick: how do I unset the _id key ?
[10:32:51] <Derick> I don't know that.
[10:33:03] <Derick> but there must be a way to unset a key in a basicdbobject
[10:33:32] <rspijker> there is a .removeField(String key)
[10:33:50] <Derick> it inherits hashMap, so it's unset should work I think
[10:34:04] <Derick> sorry, remove(): http://docs.oracle.com/javase/1.5.0/docs/api/java/util/HashMap.html?is-external=true#remove%28java.lang.Object%29
[10:34:09] <rspijker> tmh1999: just to be clear, setting it to mykey won’t work. You have to set it to a new value every iteration, that’s the whole point
[10:34:54] <rspijker> also, creating a new object every iteration shouldn’t really be such a huge deal…
[10:35:10] <rspijker> They are all different objects after all….
[10:35:43] <tmh1999> I am dealing with, like, 10-100 mil record. Is that efficient, in Java?
[10:36:00] <rspijker> GC should take care of it
[10:36:37] <rspijker> just found this in some example code:
[10:36:38] <rspijker> for(int i=0;i<10000000;i++){
[10:36:39] <rspijker> col.insert(new BasicDBObject().append("Value", (int)(Math.random()*79)).append("ID", UUID.randomUUID().toString()));
[10:36:40] <rspijker> }
[10:36:41] <rspijker> that runs fine
[10:36:58] <rspijker> (note that ID is not _id)
[10:37:06] <tmh1999> cool! I will try.
[10:37:52] <tmh1999> super thanks, rspijker and Derick
[10:38:00] <rspijker> good luck
[10:43:19] <tmh1999> rspijker: It worked. You rock !
[10:45:34] <uhrt> hi
[10:52:03] <sfix> hi guys, when creating a new user with the "readAnyDatabase" role, do I leave the database param blank or set it to "admin" ?
[11:00:03] <f0ll0w> ?
[11:00:05] <f0ll0w> ?
[11:00:06] <f0ll0w> ??
[11:05:02] <uhrt> so i’m using mongo’s text search in a python web app i’ve been building. Thing is, all the data’s in greek, but text search has got no support for greek. This is a bit of an issue 'cos people will often omit diacritics, and mongo won't find a match for, say, 'ενα' (should be 'ένα'). Right, so what i've done is duplicate fields i wanna make searchable, strip all diacritics, and make them lowercase. When a user performs a search, the search text
[11:05:03] <uhrt> has its diacritics stripped and made lowercase before we query the db. What i'm wondering is if there's something else i could or should be doing that won't mean cluttering the db with special search fields (that is, apart from resorting to using lucene/solr/fml)
[11:05:45] <Derick> not until we land greek support into FTS I think
[11:07:44] <uhrt> all right then, thanks
[11:11:41] <Derick> hi hannesvdvreken
[12:04:31] <Siyfion> Given an example collection of products: https://gist.github.com/Siyfion/d10d2a2894534a0cf7ed
[12:04:52] <Siyfion> Is there any easy way I can run an update command to remove all "link" values that match a given ID?
[12:07:23] <Siyfion> eg. If I wanted to null the values that currently reference "539ed92ea53e1402005f888f", I'd like the 'fields.image.link' on object 1, the 'fields.other.link' on object 2 and 'fields.image.link' on object 3 to be nulled.
[12:40:46] <remonvv> Siyfion: If you mean remove them from both the "image" and the "title" embedded documents then the answer is no. If you only need to do so for "image" then it's possible through update({'fields.image.link': <YOURID>}, {$unset:{'fields.image.link':1}}, false, true)
[12:41:06] <remonvv> Pretty sure that what you actually want is the former, and that's not possible as there are no wildcard operators.
[12:41:31] <remonvv> It's solvable but you need a different schema for that.
[12:41:36] <Siyfion> Yeah it may / may not be called "image"
[12:41:48] <Siyfion> I think I'm coming to that realisation
[12:42:11] <Siyfion> If I changed 'fields' to be an array, and add a "binding" key
[12:42:16] <remonvv> Then yes, a different schema. It needs to be an array of elements where one of the element fields is "type". That way you can select all elements that match one or more types and specific link values in one go.
[12:42:20] <remonvv> Exactly.
[12:42:35] <Siyfion> then I could just do 'fields.link'
[12:42:47] <Siyfion> Right, glad to know I'm on the right track :D
[12:43:11] <Siyfion> plus it's a more strict schema that way too
[12:44:31] <remonvv> It's simply better. You rarely want to store a collection as something other than an array. Even maps are generally best stored as an array with elements that contain both key and value(s).
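
A sketch of what that array-based schema and update could look like (document shape and names are assumptions, not Siyfion's actual data); note that in 2.6 the positional $ operator only touches the first matching array element per document:

    // shape: { fields: [ {binding: "image", link: ObjectId(...)},
    //                    {binding: "other", link: ObjectId(...)} ] }
    db.products.update(
        {"fields.link": ObjectId("539ed92ea53e1402005f888f")},
        {$set: {"fields.$.link": null}},
        false,  // upsert
        true    // multi: every matching document, first matching element in each
    )
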
[12:51:41] <tuxtoti> Hi. I'm trying to figure out why a particular query is taking a long time to execute..and in the process learning .explain's output.
[12:51:59] <tuxtoti> I have included as much data as possible here: http://pastebin.com/m6eJkwbG
[12:52:47] <tuxtoti> i just can't understand why my query is taking a long time even though the querying field is indexed (and being used according to explain)
[12:53:39] <tuxtoti> Can anyone help me out?
[13:23:28] <joannac> tuxtoti: I'm really confused by that query
[13:25:44] <joannac> it seems useless
[13:25:51] <joannac> anyways, your index is not sparse
[13:27:05] <joannac> that means the index entries that have c4_in:null contains documents where either c4_in is the value null, or c4_in doesn't exist
[13:27:22] <joannac> but you only want the ones where c4_in is the value null
[13:27:38] <joannac> so it has to page in all 12M+ documents to check
[13:27:56] <joannac> which is why it takes so long, and your numYields is so high
[13:53:11] <cocotton> hey guys, when trying to login mongo (runing mongo command), I get the errno:111 connection refused. From what I can see, mongos is not running
[13:53:26] <cocotton> I'm really new with mongo, anyone has an idea of what might be wrong?
[13:55:02] <rspijker> cocotton: are you running a mongo server locally?
[13:55:09] <cocotton> I am
[13:55:20] <rspijker> is it listening to the correct ip?
[13:55:36] <rspijker> on some installations (e.g., ubuntu iirc) the default bindIP is weird
[13:56:05] <cocotton> I can see mongod listening on the correct port
[13:56:12] <rspijker> 27017?
[13:56:19] <cocotton> Yet on other machines (other environment) I also see a mongos process listening on another port
[13:56:39] <cocotton> mongod is listening on 27019
[13:56:51] <cocotton> on other machines, I got mongod listenening on 27019 and mongos listening on 27017
[13:56:53] <rspijker> ah, mongo without parameters will try to connect to 27017
[13:56:56] <joannac> mongo --port 27019
[13:57:13] <rspijker> you only need a mongos if you are using a sharded setup
[13:58:03] <cocotton> It is
[13:58:10] <cocotton> a sharded setup, sorry my bad
[13:58:39] <rspijker> then I’m presuming you haven’t set it up yourself?
[13:58:57] <cocotton> You are right :S
[13:59:33] <cocotton> Yet I'm the one who has to fix it :(
[14:00:51] <BurtyB> does anyone know if failIndexKeyTooLong is likely to stay around or is it meant just for transitioning to 2.6?
[14:02:11] <cocotton> @rspijker From what I can see, this is a "newer" version of mongo, and the package mongo-10gen has been replaced
[14:02:32] <cocotton> Not replaced, uninstalled*
[14:02:42] <cocotton> thus the mongos command does not exist anymore
[14:03:37] <rspijker> cocotton: ah, you can install: mongodb-org-mongos
[14:03:56] <rspijker> but normally you would install: mongodb-org
[14:04:08] <rspijker> it’s a metapackage that contains server, mongos, tools and the client
[14:04:34] <rspijker> seems weird that you only have mongod and mongo, that would mean someone installed those separately
[14:04:49] <rspijker> or they haven’t been replaced or something. Hard to tell without knowing your setup
[14:05:46] <tuxtoti> joannac: I'm actually interested in documents where c4_in doesn't exist. Yes, I could have used $exists but that didn't help either.
[14:08:26] <tuxtoti> joannac: I don't understand ..when you say it has page all 12M+ docs. I 'm having a covered query right? (the projection has c4_in:1 and _id:0 ) ?
[14:08:38] <tuxtoti> joannac: so why does it have to go through all of it.
[14:12:41] <tuxtoti> joannac: essentially, what do I do to make this fast?
[14:23:34] <rspijker> tuxtoti: redesign… $exists doesn’t work well with queries
[14:24:32] <tuxtoti> rspijker: ..yeah I read that somewhere... a lot of confusion over whether it uses the index or not.
[14:25:04] <rspijker> I think it might have changed over time, actually…
[14:25:15] <rspijker> it also depends on what “x”:null means as a query
[14:25:32] <tuxtoti> rspijker: that's precisely why i use c4_in: null, which should return documents which don't have c4_in and documents which have c4_in's value set to null
[14:26:12] <rspijker> that’s my understanding as well
[14:26:22] <rspijker> but then joannac’s explanation makes no sense to me...
[14:26:47] <rspijker> if the index were to store it as null regardless of whether the field exists or not, the query should be able to use the index
[14:29:38] <rspijker> tuxtoti: I hadn't actually looked at your output… the query returns all fields
[14:29:53] <rspijker> so regardless of whether or not it’s indexed, it’s going to be slowish...
[14:30:31] <rspijker> well, I don't know if it hits everything, that’s an assumption I made due to the 12M that was referenced before
[14:30:38] <rspijker> either way, the query is returning 12M documents
[14:31:02] <rspijker> although >20 mins seems long for that...
[14:31:33] <rspijker> also not sure why it's not indexOnly
[15:13:26] <mp_> i'm trying to summarize a table that tracks user activity grouped by week + user. anyone have a moment to help with adding in blank data for weeks where a user did not have activity?
[15:18:57] <sflint> mp_: why do you need to add in blank data?
[15:22:11] <mp_> sflint : i'm displaying graphing counts by week and would like to prepare the data correctly at the source
[15:22:33] <mp_> i guess i could hack it in there in my UI/etc, but would be more correct to do it at the source
[15:24:33] <sflint> mp_: you can just have null = 0
[15:24:35] <sflint> in the code
[15:24:54] <sflint> mongo doesn't need to store empty values
[15:25:35] <sflint> but if you want you can just do db.collection.update({<key>:null},{$set : {<key>:0}})
[15:25:40] <sflint> mp_: ^^
[15:25:44] <sflint> that will fill in the null with 0
[15:26:10] <Siyfion> So the docs say that "$pull: { results: { score: 8 , item: "B" } }" will pull all elements that contain a score field of 8 AND an item field of "B"... How would I do it if I wanted to pull all elements that have a score field of 8 OR a score field of 4...?
[15:26:21] <mp_> sflint : so i'm not really trying to store 0s
[15:26:27] <Siyfion> (From the 'results' array)
[15:26:39] <mp_> i just want to aggregate the data and add zeros where there was no data
[15:26:39] <mp_> i.e.
[15:26:53] <mp_> user A has an event in weeks 1,2,4,5
[15:27:18] <mp_> i aggregate and get a list with [{week1...stuff}, {week2...stuff} etc]
[15:27:44] <sflint> Siyfion: try $or []
[15:27:49] <mp_> week 3 and week 6 aren't in the DB, but i want to have records in my results with {week3, count: 0}, more or less
[15:28:24] <sflint> mp_: assuming you are using the aggregation framework?
[15:28:35] <mp_> yes
[15:28:39] <mp_> let me get an example of the query
[15:28:40] <mp_> and data
[15:28:48] <sflint> nah i got it
[15:31:17] <rspijker> Siyfion: “results.score”:{$in:[4,8]}
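
So the full update could look something like this (a sketch; the filter simply targets every document that has such an element):

    // pull every results element whose score is 8 or 4
    db.collection.update(
        {"results.score": {$in: [4, 8]}},
        {$pull: {results: {score: {$in: [4, 8]}}}},
        false,  // upsert
        true    // multi
    )
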
[15:31:19] <sflint> mp_: actually give me the query you are running
[15:32:11] <mp_> http://pastebin.com/cQd5eUzp
[15:32:24] <mp_> sflint : that's the basic setup with some data/etc redacted, but should get the point across
[15:37:21] <sflint> mp_: you are ruling out 0 because of the $exists i think
[15:37:27] <sflint> checking again
[15:37:34] <mp_> sflint: yes, that's true
[15:37:46] <mp_> however, grouping by week in the first place breaks it
[15:38:03] <mp_> i.e. if there is no date, i can't group by the event that doesn't exist in the first place
[15:38:19] <mp_> so i need to do some kind of preliminary grouping by week to indicate the last 12 weeks
[15:38:29] <mp_> then find the event counts after it?
[15:38:59] <mp_> whereas right now, i start at event and filter down the date, i need to reverse it, i think
[15:40:31] <sflint> mp_: yeah i don't use aggregation really at all. I do that in my schema design and then just $inc the indicator
[15:40:56] <rspijker> mp_: that only works if you can guarantee there is an event in every week though
[15:41:35] <mp_> right - i can't guarantee that every week has an event.
[15:42:12] <rspijker> this isn’t straightforward then
[15:42:45] <mp_> hahah, exactly ;-)
[15:43:11] <mp_> should i just kludge the data post-pull or do you guys have any bright ideas?
[15:43:28] <sflint> use a schema design to track messages by month
[15:43:36] <sflint> month->week->day->minute
[15:43:49] <sflint> then capture the event in code and send an $inc
[15:44:17] <mp_> well, the data is ~2 years old, so it's too late to restructure dramatically
[15:44:19] <sflint> could do that for each user
[15:44:35] <sflint> well you could script it to fill in the $inc
[15:44:37] <mp_> or are you suggesting re-shaping the old data
[15:44:38] <mp_> ah, gotcha
[15:46:05] <sflint> you could aggregate the "timestamp" and write that to new collection first
[15:46:12] <sflint> then use that to get each week
[15:47:23] <mp_> yeah, i'm thinking something along those lines has to be the right thing.
[15:48:08] <mp_> i need to start with a group of week numbers and then run my query from that
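
One way to do that post-pull, as a sketch (assumes a 2.6 shell, an events collection with a ts date field, and that weeks 1-12 are being graphed):

    var counts = {};
    db.events.aggregate([
        {$group: {_id: {$week: "$ts"}, count: {$sum: 1}}}
    ]).forEach(function (doc) { counts[doc._id] = doc.count; });
    // fill in zero counts for weeks that had no events
    for (var w = 1; w <= 12; w++) {
        if (counts[w] === undefined) counts[w] = 0;
    }
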
[15:54:39] <Siyfion> Thanks both rspijker and sflint
[15:56:25] <mp_> yes, thank you. don't have an answer yet but really appreciate your attention.
[17:09:05] <dorongutman> hello
[17:09:30] <dorongutman> how does a soft-delete work in regards to indexing ?
[17:10:21] <dorongutman> a document that is “deleted” has a field named “deletedAt” with a date, and non-deleted documents don’t have it
[17:10:48] <dorongutman> so for getting the not deleted documents - how do I set the indexes ?
[17:15:13] <TorbenG> is there a mongodb interface to see my dbs? localhost/27017 is not running ...
[17:26:31] <DubLo7> Can one query get documents from two different collections?
[17:27:09] <DubLo7> Like get all users from user collection that are less than 1 year old, and all orders from order collection that are made by those users?
[17:38:58] <cheeser> no
[17:39:02] <cheeser> you need two queries
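
In the shell, the two-query approach could look like this (collection and field names are assumptions):

    // users created within the last year
    var cutoff = new Date(Date.now() - 365 * 24 * 3600 * 1000);
    var userIds = db.users.find({createdAt: {$gte: cutoff}}, {_id: 1})
                          .map(function (u) { return u._id; });
    // their orders, joined in the application
    var orders = db.orders.find({userId: {$in: userIds}}).toArray();
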
[18:12:40] <TorbenG> and how can i open cmd again? so that i have two or more cmds open
[18:13:33] <kali> ho my
[18:13:48] <kali> windows-R, type 'cmd', return
[18:15:43] <TorbenG> nice, thanks
[18:16:01] <TorbenG> it works
[18:45:05] <scorphus> Hello, everyone
[18:46:43] <scorphus> I’m trying to create a unique compund index with db.collection.ensureIndex( { a: 1, b: 1 }, { unique: true } )
[18:47:35] <scorphus> I’m wondering where/when should I place/call this statement
[18:48:15] <scorphus> right after any insertion? just after the first one? or is there some way to create or setup an index?
[18:48:40] <scorphus> s/compund/compound/ sorry
[18:49:10] <kali> scorphus: whenever it's convenient
[18:49:19] <kali> scorphus: even before the first insertion
[18:49:25] <scorphus> hmmm
[18:49:52] <kali> scorphus: of course, it can fail if you do it when some documents are already there if there is a dup
[18:50:07] <scorphus> yes, I understand that
[18:51:05] <scorphus> kali: is it ok to call that everytime I insert? I don’t have any “setup” or “initialize” feature
[18:51:50] <scorphus> it is not a collection that receives much inserts, just once in a while
[18:53:07] <kali> scorphus: well, you'll pay a network roundtrip for nothing, and you may need to catch or ignore an error, but apart from that, it's fine
[18:53:50] <scorphus> ok, I’ll see what I do
[18:53:55] <scorphus> thank you, kali
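
Since ensureIndex is effectively a no-op when an identical index already exists, calling it once at application startup (rather than around every insert) is the usual compromise when there is no explicit setup step:

    // safe to run repeatedly; only the first call actually builds the index
    db.collection.ensureIndex({a: 1, b: 1}, {unique: true})
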
[19:35:06] <davejfranco> hi
[19:35:29] <davejfranco> I need some advice using MMS
[19:36:10] <kali> "davejfranco" ? :)
[19:36:27] <davejfranco> yes
[19:36:34] <davejfranco> my name
[19:36:37] <davejfranco> fullname
[19:39:43] <davejfranco> Does anyone know how to connect the mms agent if a database is on a private subnet? and your application server is behind a proxy
[19:40:54] <kali> you need a router to perform NAT for an inner host hosting the agent
[19:42:05] <kali> i mean: the agent will connect the MMS platform, not the opposite. basically... it should work out of the box
[19:42:07] <davejfranco> I'm hosting on amazon ec2, and the databse server can connect via a nat instance but in this case I think it needs a reverse nat
[19:45:21] <kali> i don't think so, it's just good old regular NAT
[19:46:19] <kali> the connection will go from the agent in your private subnet, through your router, to the MMS platform. just NAT.
[19:47:09] <kesroesweyth> I have a collection of documents that all contain a property which contains an array. Is there a simple way to get a single list of distinct keys from all of these arrays?
[19:47:31] <kesroesweyth> Meaning, just one list no matter how many documents the collection contains.
[19:48:10] <kali> kesroesweyth: is "distinct" not working ?
[19:48:11] <davejfranco> I was checking mms create a connection in two ways
[19:48:51] <kesroesweyth> kali: I guess I assumed it wouldn't just magically work since I didn't see a hint about it in the documentation. Let me try.
[19:49:24] <kesroesweyth> It magically worked.
[19:49:26] <kesroesweyth> Gees.
[19:49:33] <kesroesweyth> Thanks kali. :) afk
[19:50:30] <kali> one more documentation [FAIL] i'm afraid
[19:51:15] <kali> davejfranco: i'm not sure about that... have you tried ?
[19:51:46] <davejfranco> kali: about what?
[19:53:10] <kali> davejfranco: about the mms cloud initiation connections
[19:54:44] <davejfranco> I tried on a mongo in a public subnet and it works perfectly but in a production environment I want that DB on a private subnet
[19:54:54] <davejfranco> I'm also trying with datadog
[19:55:20] <kali> davejfranco: i'm pretty sure it will be happy with a regular NAT setup
[19:55:48] <davejfranco> the mms I think not only receive from the db, mms query your db
[19:55:57] <davejfranco> so it need to connect directly
[19:56:03] <kali> no. the agent does
[19:56:33] <kali> the agent talks to your database (so it needs to be on the private subnet) and talks through the NAT to the MMS cloud
[19:56:52] <kali> but MMS will not connect to your stuff
[19:57:09] <davejfranco> I will continue testing to see if I can make it work
[20:02:48] <Viesti> ugh, stuck at "replSet error RS102 too stale to catch up" :(
[20:03:08] <Viesti> do I really need to delete all data files and do a sync from scratch?
[20:03:13] <Viesti> again...
[20:03:34] <Viesti> just did that but apparently the oplog isn't big enough on the primary so it got rolled over
[20:03:55] <tscanausa> you can sync from scratch or you can rsync the data files
[20:04:31] <Viesti> I have data files on the secondary from a previous initial sync from scratch
[20:04:42] <Viesti> aren't they good enough? :/
[20:04:46] <tscanausa> you will need newer files
[20:04:51] <Viesti> right
[20:06:03] <kali> Viesti: it may be a good idea to allocate more space to the oplog
[20:06:50] <kali> Viesti: look at the replica set on the primary, it will state how much time the oplog covers. if this time is not bigger than the time you need for a full resync, it will likely fail again
[20:08:16] <Viesti> yep, just increased oplog size on the secondary, but I guess the problem is the too small size on primary
[20:08:45] <Viesti> so rsync files from primary to secondary would be an option?
[20:09:00] <kali> Viesti: that will only work if you stop the primary
[20:09:05] <Viesti> :/
[20:09:36] <Viesti> so I have this long running data import onto the primary
[20:09:44] <Viesti> which probably causes this problem
[20:09:45] <kali> ha
[20:09:47] <kali> yeah
[20:10:00] <kali> it all goes straight to the oplog
[20:10:10] <Viesti> which then rolls...
[20:10:23] <Viesti> so I have to basically skip or pause it
[20:10:26] <kali> ... before the secondary has a chance to catch up
[20:10:29] <Viesti> yep
[20:11:06] <kali> well, at least you understand the options
[20:11:07] <Viesti> just wondering that I have data now in the secondary, but I guess it's then useless because it's stale...
[20:11:11] <Viesti> :D
[20:11:54] <kali> yeah, if you dont have all the oplog from that secondary optime to now, the data is useless
[20:15:34] <Viesti> meh...
[20:18:33] <Viesti> is there another way to force initial sync than to just destroy data files
[20:18:42] <Viesti> like from mongo shell
[20:19:56] <kali> stop, delete dbpath content, start. that's all i know :)
[20:21:00] <Viesti> :)
[20:26:24] <Viesti> if I delete data files, then how do I specify oplog size?
[20:26:29] <Viesti> in the config?
[20:27:11] <kali> config file or command line options
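
For reference, the oplog window can be checked on the primary, and a new size has to be given at startup (a sketch; the 10 GB figure is an arbitrary example):

    // mongo shell, on the primary
    db.printReplicationInfo()   // "log length start to end" should exceed the time a full resync takes
    // mongod config file (2.6 YAML format):
    //   replication:
    //     oplogSizeMB: 10240
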
[20:27:23] <Max-P> Hi, does any one know which is better/faster between multiple sparse indexes or just one big shared index?
[20:28:22] <Max-P> I have a big collection that holds types of objects, some of which have external references to other collections, but not always on the same field. Trying to see if I should index them all or have just one big "externalReference" field and index that.
[20:28:29] <kali> Max-P: it depends on the query. if you only have one kind of query, it's better to have an index tailored for it
[20:29:03] <kali> Max-P: yeah, denormalize it in an array and index it
[20:30:55] <Max-P> kali, Basically, I can have something like {type: "something", somethingId: ObjectId('...')} and {type: "otherthing", otherthingId: ObjectId('...')}, and I will always query with the `type` field set
[20:32:16] <Max-P> It's really either one or the other, so I'm wondering if it costs more to maintain two indexes and if I should just use a common field "otherId" for every entry
[20:33:21] <kali> i would maintain a unified index, i guess. but if it's only two indexes, it's still manageable.
[20:33:50] <kali> Max-P: an index on (type, somethingId, otherthingid) is useless
[20:34:26] <kali> i meant "unified id", not index
[20:34:36] <kali> ho my, i'm not making any sense
[20:34:48] <Max-P> kali, it's fine :)
[20:37:58] <Max-P> Actually that's going to be more like 5-6 of them, but all of them will be sparse as only their appropriate type will have the matching indexed field
[20:38:26] <kali> yes
[20:38:32] <Max-P> I'm really more worried about insert/update time than querying actually
[20:39:13] <Max-P> So to simplify the question, does Mongo even bother updating a sparse index when the corresponding field does not exist?
[20:40:06] <kali> well, if you have reasons to worry, you have to run a benchmark
[20:40:32] <Max-P> Alright, thanks kali !
[20:55:57] <Max-P> Results are it's about 10% faster to use one big index on average, so I'll go with the unified index.
[21:00:34] <Max-P> Yeah the more indexes you add the more time it takes even if they are unused. 10% hit with 10 indexes, 20% hit with 20 indexes
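
A sketch of the "one big shared index" option being benchmarked (externalReference is the field name mentioned above; document shapes are assumptions):

    // one shared reference field with a single index, instead of somethingId, otherthingId, ... each indexed
    db.objects.ensureIndex({externalReference: 1})
    db.objects.insert({type: "something", externalReference: ObjectId("53cec89600c915dfcd0231d0")})
    db.objects.find({type: "something", externalReference: ObjectId("53cec89600c915dfcd0231d0")})
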
[21:16:32] <ranman> Max-P: I have some questions about that, do you have a simplified test case for that?
[21:17:01] <ranman> wait the 10% on insert?
[21:18:50] <ranman> Max-P: if I read and understand your usecase correctly you could make use of compound indexes and prefixes: http://docs.mongodb.org/manual/core/index-compound/#compound-index-prefix
[21:28:04] <Max-P> ranman, No, I'm not using any compound index. None of the index have any relation between them
[21:28:42] <Max-P> I'll always be querying only one at a time
[21:30:16] <Max-P> ranman, http://pastie.org/9474321 here's the test I ran
[21:30:50] <Max-P> Pretty ugly, but did the job for what I needed to know. At any point only one of the said fields will be present at a time.