PMXBOT Log file Viewer


#mongodb logs for Tuesday the 12th of June, 2012

[00:02:23] <multi_io> can update() return the new row, or at least a field from it?
[00:02:38] <multi_io> *new document, rather
[02:24:20] <krz> http://pastie.org/4071109. it will always push to the array 'visits', if visits is an existing array. which is fine. but what if i wanted to $push to visits, only if name foobar did not exist. if it did, then just update the token. what should i be using?
[02:31:24] <whaley> is there a way to initiate a replica set on a database from the java driver? (this is for some system tests where I'd like to have a replica set available but created at runtime of the tests, since the databases are set up/torn down after test runs)
[02:33:34] <whaley> I'm going to assume it's a DB.command("rs.initiate({configstuff})"). Going to give that a shot.
[02:33:46] <whaley> thanks for being my rubber duck
[05:03:02] <carsten> the stupid pmxbot is still there? kick this privacy violating bot please
[07:37:24] <[AD]Turbo> hi there
[08:39:24] <Saby> hi, i was wondering if it is possible to write a query which updates all records where currentDate - previousDate > 2 hours in a collection
[08:40:26] <carsten> simple usage of $gt or $lt operators
[08:40:43] <carsten> unless currentdate and previousdate are two fields on the object
[08:41:19] <Saby> carsten currentdate and previousdate are 2 date fields
[08:41:51] <carsten> you're lost
[08:42:22] <Saby> oh
[08:42:35] <Saby> could you please explain
[08:43:18] <carsten> nothing to be explained. this cannot be done in a plain query except with the $where operator, which is not recommended
[08:43:27] <carsten> explained very often - so read up
[08:57:35] <NodeX> you want currentDate minus previous date greater than 2 ?
[08:58:44] <Saby> NodeX: yes
[08:58:55] <Saby> greater than 2 hours
[08:59:06] <NodeX> and you want mongo to do this ?
[08:59:10] <carsten> you just got a very good answer
[08:59:12] <carsten> didn't you?
[08:59:13] <Saby> yes NodeX
[08:59:18] <NodeX> as in you want mongo to do the calculation
[08:59:25] <Saby> yes NodeX
[08:59:29] <NodeX> map/reduce
[08:59:42] <carsten> don't listen to NodeX
[08:59:55] <carsten> MR is not a reasonable solution here
[09:00:06] <Saby> ok
[09:00:27] <Saby> so basically i should retrieve all the records and do the calculation in my program ?
[09:00:48] <NodeX> yep
[09:01:12] <Saby> that could be very intensive as the number of records keep increasing
[09:01:49] <Saby> this thread would keep running every hour, which is why i wanted to get it done on the mongodb side
[09:01:56] <Saby> as it would reduce the network usage too
[09:02:27] <NodeX> you're going to have to provide one of the dates
[09:03:40] <Saby> NodeX: both dates are in the records
[09:03:58] <NodeX> no .. your APP will need to provide one of the dates
[09:04:13] <NodeX> to put it simply .. without map/reduce Mongo will not calculate things for you
[09:04:57] <Saby> NodeX: db.coll.find output >> http://pastie.org/4072459
[09:05:04] <Saby> my collection has records like that
[09:05:32] <NodeX> [10:01:47] <NodeX> to put it simply .. without map/reduce Mongo will not calculate things for you
[09:05:43] <Saby> ok
[09:05:58] <Saby> I'll try it with a MR
[09:06:09] <Saby> if it is not stable then will get it done in the program
[09:06:31] <NodeX> map/reduce will be slow on large data
[09:06:50] <Saby> what about like 20k records
[09:07:08] <Saby> i dont think we would exceed more than 20k records
[09:07:29] <Saby> carsten: could you explain why MR won't be a reasonable solution here ?
[09:07:46] <NodeX> it is reasonable but it will be slow
[09:08:11] <NodeX> I have carsten on ignore because he is annoying so I don't know what advice he is offering
[09:08:30] <Saby> oh ok
[09:08:47] <Saby> his advice was to get it done in the program
[09:09:07] <carsten> hehe
[09:09:27] <carsten> he cannot hear complaints about his often wrong advice
[09:09:37] <carsten> back to breakfast
[09:09:51] <NodeX> that's my advice because it will be faster
[09:10:09] <NodeX> if you -must- do it in mongo then you -have- to use Map/reduce
[09:10:30] <Saby> ok
[09:10:31] <Saby> thanks NodeX
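A minimal map/reduce sketch of what NodeX suggests, using the currentDate/previousDate fields from the discussion; the collection names coll and stale_check are assumptions. Note it writes results to an output collection rather than modifying the source documents (which matters later in this log):

    var map = function () {
        // flag documents whose two dates are more than two hours apart
        var twoHours = 2 * 60 * 60 * 1000; // milliseconds
        emit(this._id, { stale: (this.currentDate - this.previousDate) > twoHours });
    };
    var reduce = function (key, values) {
        return values[0]; // keys (_id) are unique, so reduce never combines anything
    };
    db.coll.mapReduce(map, reduce, { out: "stale_check" });
    db.stale_check.find({ "value.stale": true });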
[09:22:14] <horseT> Hi, I need some help regarding the aggregation framework, I want to do something like SELECT SUM(point) FROM score GROUP BY date;
[09:23:14] <horseT> I'm not sure that it's doable :(
[09:23:27] <Carsten> perfectly doable
[09:26:13] <horseT> i am able to group by date but I need to group by day
[09:26:32] <horseT> {
[09:26:34] <horseT> "dcomp" : ISODate("2012-06-11T19:56:33.601Z"),
[09:26:36] <horseT> "sum" : 500
[09:26:38] <horseT> },
[09:26:40] <horseT> {
[09:26:42] <horseT> "dcomp" : null,
[09:26:44] <horseT> "sum" : 4700
[09:26:46] <horseT> },
[09:26:48] <horseT> {
[09:26:50] <horseT> "dcomp" : ISODate("2012-06-11T19:59:41.475Z"),
[09:26:52] <horseT> "sum" : 500
[09:26:54] <horseT> },
[09:26:56] <horseT> {
[09:26:58] <horseT> "dcomp" : ISODate("2012-06-11T20:09:04.327Z"),
[09:27:00] <horseT> "sum" : 500
[09:27:02] <horseT> },
[09:27:04] <horseT> {
[09:27:06] <horseT> "dcomp" : ISODate("2012-06-11T20:09:04.191Z"),
[09:27:08] <horseT> "sum" : 500
[09:27:10] <horseT> },
[09:27:11] <NodeX> PASTEBIN
[09:27:12] <horseT> {
[09:27:13] <rexxars> pastie would be a preferable way of doing that...
[09:27:14] <horseT> "dcomp" : ISODate("2012-06-11T20:13:59.971Z"),
[09:27:16] <horseT> "sum" : 500
[09:27:18] <horseT> }
[09:27:20] <horseT> this is not correct for me, I need to have only one line ISODate("2012-06-11")
[09:28:34] <Carsten> could you stop with your flooding please?
[09:29:14] <horseT> no prob
[09:31:43] <horseT> any idea to group by day without storing extra data like day number ?
[09:32:59] <Carsten> you can group only by something that you have
[09:33:26] <Saby> NodeX: Map function -> http://pastie.org/4072562
[09:33:33] <horseT> Thanks, I've got my answer
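One way to bucket by calendar day with the aggregation framework's date operators, sketched against the score collection and point field from horseT's SQL example (the operators are standard; the collection and field names are taken from the conversation):

    db.score.aggregate(
        { $project: {
            point: 1,
            day: {
                y: { $year: "$dcomp" },
                m: { $month: "$dcomp" },
                d: { $dayOfMonth: "$dcomp" }
            }
        } },
        { $group: { _id: "$day", sum: { $sum: "$point" } } }
    );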
[09:34:00] <Saby> how should i write the reduce function, which basically checks if previousDate-currentDate > 2 hours and then sets priority: 2
[09:34:35] <horseT> Saby : I ll have a look
[09:34:47] <Saby> thanks horseT
[09:43:27] <Saby> NodeX: http://pastie.org/4072602
[09:43:31] <Saby> should this be good ?
[09:43:43] <Saby> Carsten: http://pastie.org/4072602
[09:47:42] <Saby> I'm assertion" : "invoke failed: JS Error: TypeError: value.previousDate has no properties nofile_b:2",
[09:47:50] <Saby> I'm getting that error >> assertion" : "invoke failed: JS Error: TypeError: value.previousDate has no properties nofile_b:2",
[10:12:12] <siimo> is there a sqlite style mongo version ? where i can store everything in single file and use a library to query it?
[10:15:55] <Carsten> no
[10:16:21] <siimo> what about any other document db that is file based?
[10:17:53] <NodeX> what's wrong with mongo ?
[10:19:29] <Saby> NodeX: any insights why my MR doesn't run ? http://pastie.org/4072721
[10:20:34] <NodeX> with what error Saby ?
[10:20:50] <Saby> it doesn't update the documents
[10:21:58] <NodeX> it's probably not calculating it correctly
[10:22:04] <NodeX> remove the if().. and see if it does
[10:26:39] <Saby> NodeX: still doesn't update any records
[10:43:26] <siimo> NodeX: nothing, i just need a self contained solution
[10:43:35] <siimo> dont want to have to ship mongo with it
[10:46:17] <Carsten> MongoDB is not an embedded database
[10:46:21] <Carsten> look for something else...
[10:46:39] <siimo> yea
[10:57:36] <NodeX> berkeley-db ?
[11:00:14] <Saby> Carsten: any insight why this MR won't update the records ? http://pastie.org/4072721
[11:01:23] <Carsten> MR is for updating data?
[11:01:29] <Saby> yes Carsten
[11:01:30] <Carsten> MR is for aggregating content
[11:01:35] <Carsten> and not for modifiying content
[11:01:53] <Saby> isn't it possible to modify the content through MR
[11:01:53] <Saby> ?
[11:01:59] <Carsten> N O
[11:02:05] <Saby> oh damn
[11:02:13] <Saby> i spent 2 hours wasting my time :P
[11:02:24] <Carsten> your problem...
[11:02:35] <Carsten> it's called *aggregation*
[11:02:36] <Saby> thanks though :)
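Since map/reduce aggregates rather than modifies, the update Saby wanted would be done from the application, or as a shell loop along these lines (the collection name and the subtraction order are assumptions pieced together from the earlier messages):

    db.coll.find().forEach(function (doc) {
        var twoHours = 2 * 60 * 60 * 1000; // milliseconds
        if (doc.currentDate - doc.previousDate > twoHours) {
            db.coll.update({ _id: doc._id }, { $set: { priority: 2 } });
        }
    });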
[11:42:01] <pilgo> How do I access fields in a document that I got using find()?
[11:53:30] <moshef> pilgo: obj[:attribute] ?
[12:15:02] <pilgo> moshef: I guess I needed to do findOne instead of find
[12:15:04] <pilgo> thanks tho
[12:29:16] <spillere> i'm really new to mongodb, i have a simple question, i'm doing a foursquare app, and i want to save the user and all his checkins in a db
[12:29:28] <spillere> for his checkins, should I do it as tags?
[12:30:57] <spillere> { user: 'name', checkin: [ name: [ 'venue1', 'venue2'], latitude : ['lat1', 'lat2'] ] } ?
[12:31:53] <spillere> ahh, got it
[12:40:14] <spillere> anyone here?
[12:41:05] <spillere> is this right for a database scheme? http://pastie.org/4073261 as for one user will be lots of checkins
[12:43:18] <NodeX> spillere : it depends on what gets queried more
[12:44:01] <spillere> I wanna query the user info and ALL the checkins, doing a FOR and displaying all checkins
[12:45:41] <spillere> would that work?
[12:48:42] <NodeX> which will you query more ..
[12:48:56] <NodeX> a single user with all their checkins
[12:49:04] <NodeX> or ... checkins for a place
[12:49:16] <spillere> a single user with all his checkins
[12:50:03] <NodeX> checkins should be an array like you have it then
[12:50:13] <NodeX> be aware of document size limits
[12:53:31] <spillere> NodeX what would be a good limit
[12:53:37] <spillere> can it be until like 10k items?
[12:53:48] <NodeX> you dont choose
[12:53:58] <NodeX> currently it's 16mb per document iirc
[12:54:23] <spillere> well, 10k items won't be 16mb :)
[13:23:25] <Industrial> Where did I go wrong with installing mongodb? I have both mongodb-server and mongodb-clients installed and the mongodb server is running but I don't see it in ps -e|grep mongo https://gist.github.com/1014dc4231f55a392f5f
[13:23:31] <Industrial> erm, this is on Ubuntu
[13:23:39] <Industrial> Does mongo keep a log somewhere :P?
[13:23:49] <Derick> /var/log/mongodb/mongodb.log for me
[13:24:10] <rick446> probably an unclean shutdown leaving a lock file behind
[13:25:33] <Industrial> oh, it was the lock file
[13:25:37] <Industrial> cheers
[13:28:34] <multiHYP> hi
[13:29:38] <multiHYP> i have a complaint to make. why is the java driver not supporting nested field access like the mongo shell?
[13:29:39] <multiHYP> shell example: "name": { "title": "some title"}
[13:30:07] <multiHYP> java driver example (which is not working): new BasicDBObject("name.title", "some new title")
[13:30:41] <multiHYP> my collections look like crap because of this shortcoming in the java driver.
[13:36:24] <kali> multiHYP: new BasicDBObject("name", new BasicDBObject("title", "some new title")) :P
[13:36:38] <multiHYP> kali: genius!
[13:37:02] <multiHYP> how many more nested BasicDBObjects would you define until you'd think enough is enough?
[13:37:44] <multiHYP> my collections look like crap because of this shortcoming <not impossibility> in the java driver. I thought it was self evident.
[13:38:44] <kali> i don't understand what you're complaining about.
[13:39:27] <multiHYP> its ok. ignore it.
[13:40:03] <multiHYP> my complaint is with the mongodb java driver designers.
[13:45:04] <multiHYP> so here I read that the java driver is just a json parser: http://stackoverflow.com/questions/3580529/how-do-i-update-fields-of-documents-in-mongo-db-using-the-java-driver brilliant, what an achievement. :(
[13:46:24] <algernon> ("name.title" at insert time works the same in the shell as it does with java. that BasicDBObject(...) is longer than {...} is another detail.)
[13:46:40] <algernon> java is verbose. tough luck.
[13:46:48] <maik_> hi all
[13:46:59] <maik_> anyone have experience with "getlasterror" and locks?
[13:49:53] <maik_> We have a thread running, but it is waiting on something. Last line of the thread dump:
[13:49:53] <maik_> java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method)
[13:50:01] <maik_> network issue?
[13:52:09] <multiHYP> ok here i have this: {"detail": {"title":"a", "name":"b", "address":"c"}} how would you $set the title only?
[13:52:18] <multiHYP> without removing the rest of them.
[13:52:27] <multiHYP> in java
[13:52:41] <multiHYP> I know how to do it in shell, but not in java
[13:54:01] <multiHYP> algernon: no way, i don't know whether you used the java driver at all. java driver does not even support "detail.title" type of notation.
[13:57:06] <multiHYP> people talking out of … I rest my case.
[14:00:58] <multiHYP> I put my case back on the table. here is something perhaps new to algernon too
[14:02:16] <multiHYP> strictly java driver related: if you use $set, the "detail.title" notation nested in a "$set" works. otherwise it doesn't. So the only alternative in those cases is to pass a json object very similar to the shell command.
[14:03:28] <algernon> multiHYP: I still don't see your issue. update expects dotted notation, insert doesn't. same is true for the shell.
[14:03:32] <kali> multiHYP: honestly, i think you're confused.
[14:03:36] <algernon> nothing to do with the driver.
[14:04:23] <multiHYP> upsert without $set in shell supports dotted notation, but in java driver doesn't. <- I cannot express it more clearly than that.
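algernon's point, illustrated in the shell (the Java driver behaves the same way with BasicDBObject; someId is a placeholder for the _id of the document being updated):

    // at insert time, nesting is written as a subdocument:
    db.coll.insert({ detail: { title: "a", name: "b", address: "c" } });

    // inside $set, dotted notation addresses the nested field without
    // clobbering its siblings:
    db.coll.update({ _id: someId }, { $set: { "detail.title": "new title" } });

    // replacing without $set overwrites the whole document,
    // losing detail.name and detail.address:
    db.coll.update({ _id: someId }, { detail: { title: "new title" } });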
[14:05:47] <spillere> im trying to connect using flask/python using db_connection = pymongo.Connection()
[14:05:47] <spillere> db = db_connection.4sq
[14:06:07] <spillere> but it gives an error "invalid syntax" on db = db_connection.4sq
[14:07:30] <multiHYP> spillere: doesn't 4sq require some API token?
[14:08:05] <spillere> i mean, i created one database called 4sq
[14:08:10] <spillere> could be anything
[14:08:27] <Ratler> try db_connection['4sq']
[14:08:27] <multiHYP> oh nm
[14:08:35] <spillere> just wanna connect and write some data
[14:08:48] <spillere> its python/flask
[14:09:16] <algernon> multiHYP: you're doing something wrong then.
[14:09:22] <Ratler> Could be that a leading number just doesn't work using attribute style (not a python expert) :)
[14:09:54] <multiHYP> algernon: nah solved it. java looks really ugly though.
[14:10:26] <NodeX> even ugly things need love!
[14:19:36] <algernon> multiHYP: agreed on java looking fugly :)
[14:20:13] <Ratler> There are languages much more fugly than java though
[14:21:56] <algernon> Ratler: which one, out of curiosity?
[14:22:32] <Ratler> erlang or tcl just to name some ;)
[14:22:55] <Ratler> But erlang probably takes the prize
[14:24:27] <multiHYP> java: ugly but manage-able. python: nice but out of bounds.
[14:24:50] <multiHYP> I do scala and java apis of mongodb
[14:25:03] <multiHYP> still waiting for casbah support.
[14:25:15] <multiHYP> C is good :)
[14:25:51] <algernon> Clojure is better ;)
[14:28:25] <spillere> in the mongo shell, how do I show the objects of a collection?
[14:28:45] <algernon> db.collection.find() ?
[14:29:27] <spillere> yeah, ty!
[14:42:42] <alexmic> hey I got a question about resident and shared memory
[14:43:01] <alexmic> top shows 7.5 RES and 7.5 SHR in a machine with 16GB of memory
[14:43:11] <alexmic> what's SHR used for?
[14:43:26] <alexmic> does that mean only 7.5GB are available to mongo?
[14:55:25] <spillere> how do I check if a field called 'id' has a specific number
[14:55:29] <spillere> like
[14:55:42] <spillere> if exists db.id == 5 then...
[15:17:49] <m_rec> hi everyone, is there a reason rs.status() would no longer return stateStr when being run? since the replica set elected a new primary I can't seem to find that field anymore
[15:34:20] <Killerguy> hi all
[15:34:38] <Killerguy> to backup a replicaset with fsync, to flush write
[15:34:55] <Killerguy> I need to run db.runCommand({fsync:1});
[15:35:04] <Killerguy> on master, slave or all ?
[15:41:37] <pingboy> just a quick q
[15:41:47] <pingboy> knowledge base == community in the support forums right?
[15:42:02] <pingboy> or community is a subset of the knowledgebase?
[15:47:11] <skot> Same.
[15:48:28] <pingboy> same as in knowledge-base is the community
[15:48:30] <pingboy> correct?
[15:49:20] <pingboy> oh wow.. that's what i get for using my ancient cli based irc client...
[15:49:22] <pingboy> i'm in the wrong channel....
[15:49:30] <pingboy> good thing i didn't ask for help on my genital herpes
[15:49:57] <pingboy> boy that would've been embarrassing
[16:45:20] <jasiek> if i have a data migration that removes old versions of documents and inserts new ones, am i likely to benefit from batch inserts?
[16:53:54] <skot> probably not
[16:59:37] <c3l> If I have, in my collection, a document like the following, how do I uniquely identify a "subdoc"? {title:"Some Title", subdocs:[ {title:"Subtitle one"}, {title:"Subtitle two"} ]} For instance, there is nothing stopping the subdocs from having the same title.
[17:00:35] <NodeX> unique index?
[17:02:08] <c3l> NodeX: Im quite new to mongodb, I dont really understand. Yes I would like to have some unique index for each subdoc. Should I manually set a "_id":ObjectId(..) key/val pair for each "subdoc"?
[17:04:10] <NodeX> you can do an upsert on subdocs.title
[17:04:19] <NodeX> or you can ensure a unique index on it
[17:05:38] <c3l> I cannot ensure that the titles (or any other field for that matter) will differ, since these can be manipulated by the user. What would be the best way to ensure a unique index on it?
[17:06:30] <NodeX> you can ensureIndex(); on it then
[17:07:37] <c3l> oh, I didn't know there was such a thing. Ill look into it, thanks!
[17:10:53] <NodeX> look at unique:true, dropDups:true
[17:21:17] <skot> dropDups is dangerous as it doesn't define which dup will be kept
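Both suggestions from this exchange, sketched with assumed collection and field names; skot's caveat means dropDups can silently discard documents:

    // c3l's approach: give each subdoc its own ObjectId so it can be
    // addressed unambiguously even when titles collide:
    db.coll.update(
        { title: "Some Title" },
        { $push: { subdocs: { _id: ObjectId(), title: "Subtitle one" } } }
    );

    // NodeX's approach: a unique index on the embedded field:
    db.coll.ensureIndex({ "subdocs.title": 1 }, { unique: true, dropDups: true });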
[17:43:24] <jshanks> I found that there is a hex_md5() function available on the server (for use in server side javascript), is there also a function for sha1 built in?
[17:44:45] <dotty> Hi everyone. I want to ask a question about MongoJS (Node.JS library) - should I ask here or in #Node.js?
[17:48:48] <dotty> I'll ask in both I guess. I'm attempting to connect to MongoDB using the following: "require('mongojs').connect('localhost');". This worked on my work machine but, now I'm on my laptop, it's not working and I'm getting the following error: "database name cannot be the empty string" - any clues as to why?
[18:18:04] <jshanks> Does anyone know if there is a built in sha1 function for the js mongo shell?
[18:26:05] <kali> jshanks: any javascript implementation should work, but i don't think there is anything builtin
[18:27:19] <jshanks> ok thanks kali. since there is a built in "hex_md5" function, I was hoping there was a built in sha1 function as well.
[18:29:29] <jstout24> question… which schema is better… we have visitors with attributes… here: https://gist.github.com/e185cc4a565948b00937
[18:30:01] <jstout24> my assumption is that 3 may be the best choice?
[18:33:43] <kali> jstout24: i would go for 2/
[18:34:22] <kali> jstout24: I don't like to twist the model... I would also use a boolean there instead of a string
[18:35:21] <jstout24> kali, yeah, currently it's up to the user to create "checkboxes" or form fields and those values are stored
[18:35:48] <jstout24> but we can def do checks on the application layer to do: if ('yes' === $value) $value = true, etc
[18:37:48] <kali> ok, it makes sense if your data is that flexible
[18:39:02] <jstout24> oh i didn't know you can use $in for non arrays
[18:40:19] <jstout24> i wonder the performance of `db.visitors.find({ email: 'some@email.com', 'attributes.likes_fat_chicks': 'yes' })` vs `db.visitors.find({ email: 'some@email.com', 'attributes.likes_fat_chicks': { $in: ['yes'] } })`
[18:42:06] <kali> jstout24: pretty much the same, as the optimizer will use the index on email
[18:42:36] <jstout24> nice… so my application layer doesn't have to figure out which to use for a performance gain :)
[18:42:40] <jstout24> the same query can be used everywhere basically
[18:43:00] <jstout24> mongodb is pretty legit ;P
[18:43:21] <jstout24> kali, i just realized i used my name in all the examples and i admitted to liking fatchicks
[18:43:49] <tystr> lol
[18:44:01] <mkmkmk> hey all
[18:44:07] <kali> jstout24: i don't mind :P
[18:46:29] <mkmkmk> i have a collection that has 1000-2000 writes/sec… they're actually upserts.. think of it like a unique visitor log.. if the entry doesn't exist, it's created and the _id is a standard mongo object id.. if it does, i $push a small string onto an array on the document.. im working on sharding it, but i need to figure out a good shard key for the visitors collection
[18:46:56] <mkmkmk> since im using objectid's for _id, that falls into the trap of pushing most of the writes to a single shard
[18:47:51] <Goopyo> Q: Is safe mode needed if journaling is enabled?
[18:52:02] <jstout24> mkmkmk, i'd like to know your solution when you find out
[18:52:06] <jstout24> that's exactly what i'm working on right now
[18:52:13] <jstout24> :P
[19:00:36] <Goopyo> mkmkmk: can I ask you what your mongodb server specs are?
[19:00:49] <mkmkmk> m1.mediums
[19:00:52] <mkmkmk> on aws
[19:00:54] <mkmkmk> er
[19:00:57] <mkmkmk> m1.large sorry
[19:01:06] <Goopyo> single or sharded/replicated?
[19:01:45] <dstorrs> mkmkmk: one option would be to shard based on the md5 of the visitor's name
[19:01:53] <dstorrs> (or email, or etc)
[19:02:00] <dstorrs> That would ensure a roughly random distribution
[19:02:26] <mkmkmk> not tracking any of that except a unique id, which i was using objectid's for
[19:02:38] <mkmkmk> so is that what i should start doing? gen a uuid as well
[19:02:46] <dstorrs> another option would be to include an arbitrary field in the record that comes out of a call to rand()
[19:02:55] <mkmkmk> so _id : ObjectId, uuid: <uuid>
[19:03:01] <dstorrs> it has no meaning, except it makes a good shard key
[19:03:40] <dstorrs> fwiw, I'm in the same boat as you -- just migrating to Mongo, haven't done sharding yet, still figuring it out.
[19:03:49] <dstorrs> This is just my best understanding / thinking atm
[19:04:38] <dstorrs> and you must be storing SOMETHING other than the _id, or there's not much point in the collection. what else is in there?
[19:05:00] <mkmkmk> an array of stuff that has nothing shardable
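A sketch of dstorrs's arbitrary-field idea: store a random value purely to serve as the shard key, so writes spread across shards (namespace and field names are illustrative, and note that upserts then need the shard key in their query to stay targeted):

    // (assumes sh.enableSharding("mydb") has already been run)
    db.visitors.insert({ _id: ObjectId(), r: Math.random(), events: [] });
    sh.shardCollection("mydb.visitors", { r: 1 });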
[19:48:56] <dmuino> i need to do a mongorestore of a large collection on a sharded cluster. The collection + index will not fit on a single replset. I'm currently doing the restore but it seems that all the inserts are going to the first shard, and that will fail
[19:49:23] <dmuino> is there a way for me to get the writes going to all shards? (using mongo 2.0.5)
[19:54:33] <cjm> I'm getting this error, ReferenceError: message is not defined, when trying to do a geoNear command
[19:54:38] <cjm> what could the cause of this be?
[19:55:18] <dstorrs> dmuino: It may well be that your shard key is such that all writes default to the right-hand shard.
[19:55:43] <dstorrs> In which case, the only solution (AFAIK) is to let the balancer shift chunks around to make space.
[19:56:01] <dmuino> dstorrs: i have 3 shards
[19:56:07] <dmuino> the shards were balanced before
[19:56:29] <dmuino> the dump i have is one that was taken from mongos
[19:56:54] <dmuino> the data does not physically fit in a single shard
[19:57:28] <dstorrs> dmuino: yes, but remember how the sharding works -- mongos looks at the shard key and writes the data to a chunk on the appropriate shard. when a shard starts to fill up, the balancer moves chunks to other shards.
[19:58:27] <dmuino> first i restored the config db, did a flushRouterConfig
[19:58:28] <dstorrs> if your shard key is something that constantly increases (e.g., a date), then all data will initially go to the right-hand shard
[19:59:16] <dmuino> it's a compound key - i have 'name', 'cluster', 'country' - and the data should be quite random
[20:00:06] <dmuino> oh interesting
[20:00:11] <dstorrs> http://www.mongodb.org/display/DOCS/Choosing+a+Shard+Key#ChoosingaShardKey-Writescaling
[20:00:20] <dmuino> i just did a db.chunks.find()
[20:00:26] <dmuino> and all the chunks are pointing to my first replset
[20:00:37] <dstorrs> heh. there's your issue, then. :>
[20:27:04] <opra> i am having a hard time setting mongo up for connections from a remote host
[20:27:12] <opra> is there anything i need to change to the mongo conf
[20:47:29] <Xxaxx> hello. how can it be, after db.copyDatabase('vk_photolike', 'vk_photolike', '188.93.20.75') , source: vk_photolike 65.921875GB, destination: vk_photolike 7.9501953125GB
[20:48:17] <Xxaxx> difference in 50GB ?
[20:53:37] <mediocretes> do you delete a lot?
[20:55:51] <leandroa> hi, I have a doc like this: https://gist.github.com/2920068 and I need to add indexes for sorting by sortable_key.en-US and sortable_key.es-AR. What is better: one index on 'sortable_key', or several indexes on the keys inside sortable_key?
[21:08:35] <BobFunk> having a really simple query on a pretty small capped collection take incredibly long, even though it should be a simple index lookup
[21:08:43] <BobFunk> documents in the connection are cached http requests
[21:09:09] <BobFunk> look something like {url: "http:...", response: "..."}
[21:09:29] <mediocretes> what are you querying on, and what does the index look like?
[21:10:01] <BobFunk> let me gist it
[21:11:49] <BobFunk> https://gist.github.com/2920152
[21:12:25] <BobFunk> there's just 181776 documents in the collection
[21:12:40] <mediocretes> certainly looks like it's not using that index
[21:12:52] <BobFunk> yeah
[21:13:05] <mediocretes> what's explain() look like?
[21:13:18] <BobFunk> completely different - let me add it to the gist
[21:13:45] <BobFunk> https://gist.github.com/2920152#comments
[21:14:42] <BobFunk> and when I run the query from the console, its fast enough and seems like its using the index
[21:15:02] <BobFunk> but when the app uses the query, something goes wrong
[21:16:52] <multiHYP> night all
[21:19:38] <BobFunk> any ideas?
[21:20:04] <mediocretes> what are you calling from when not the console?
[21:20:51] <BobFunk> it's the java driver (via jruby)
[21:21:04] <mediocretes> can you get an explain out of that?
[21:21:25] <jstout24> Q: if i'm storing data as shown in method 2 (https://gist.github.com/e185cc4a565948b00937), how do i index attributes properly?
[21:21:46] <mediocretes> I'm pretty positive the problem is that it's not using the index, and I don't know why. Good luck!
[21:22:05] <jstout24> for a query like this: `db.visitors.find({ 'attributes.favorite_foods': { $in: ['pizza', 'beer'] } })`;
[21:22:34] <BobFunk> thanks, will see if I can get an explain from it
[21:22:38] <mediocretes> jstout24: http://www.mongodb.org/display/DOCS/Indexes#Indexes-IndexingonEmbeddedFields%28%22DotNotation%22%29
[21:22:48] <jstout24> i was just about to test dot notation
[21:22:52] <mediocretes> bobfunk: that'd be my next step if I were using the ruby driver, I think
[21:23:17] <jstout24> oh, mediocretes, but what if attributes.____ is dynamic
[21:24:03] <mediocretes> then I don't know what happens
[21:24:14] <mediocretes> index attributes, make some junk data, and use explain()
[21:25:34] <jstout24> but yeah, doing what you posted works if the field was known
[21:33:19] <csurap> how can I query a reference field using other that the ID
[21:33:55] <csurap> can I query with any other field like "user.name" instead of "user.$id"
[21:37:42] <BobFunk> hmm, ok - it seems that when I pick a new url the query takes forever, when I pick one that has been queried before, mongo uses the index and it's fast enough
[21:42:54] <BobFunk> hmm, seems the explain takes as long as the query :/
[21:50:33] <jstout24> ping kali
[21:52:05] <dstorrs> csurap: db.coll.find({ 'user.name' : 'bob' })
[21:57:21] <csurap> dstorrs: field user is an referenceOne document
[22:01:52] <jstout24> i need a mongodb genius… here's my problem and attempts at it: https://gist.github.com/eb72b04ce8990bd35b7e
[22:28:27] <BobFunk> really don't know what's up with that collection
[22:28:55] <BobFunk> even a query with a hinted index still takes forever - small collection, indexed field, straight forward query - 56 seconds
[23:17:13] <grafman> Mongodb newbie: I want to iterate over all of the collections in a db. I'm getting the names using x = db.getCollectionNames(). I'm not seeing any examples of how to use the results, or of a better way to iterate over all of the collections in a single function.
[23:31:32] <grafman> I'm a mongodb beginner. I want to iterate over the array db.getCollectionNames() gives me. Does anyone have a pointer to how this is done?
[23:31:43] <mediocretes> grafman: db.getCollectionNames().forEach(function(x){ print(x);})
[23:31:46] <mediocretes> (for example)
[23:32:00] <mediocretes> it's javascript, all the usual array methods apply, because it's javascript
[23:32:22] <grafman> You're my hero! Thank you.
[23:33:27] <mediocretes> note: there is nothing special about forEach or the function being passed to it. l = function(x){ … } ; db.getCollectionNames().forEach(l); will work just fine, because it's just javascript
[23:34:28] <mediocretes> you might also enjoy this: for each(var x in db.getCollectionNames()){ print(x);}
[23:34:52] <grafman> You are correct, I would!
[23:35:36] <grafman> Is there a way to iterate over all of the collections? I have a find that will return the last inserted document that I want to run on each one.
[23:36:04] <grafman> like db.collection() or something?
[23:36:56] <grafman> maybe I see a way here...
[23:37:53] <mediocretes> there's also this, saved me a bunch of time: db.getCollectionNames().filter(function(x){return x.indexOf("A") == 0})
[23:37:59] <mediocretes> that will find all collections that start with "A"
[23:38:26] <mediocretes> in the shell? write a function that does what you want and takes the name of a collection, then pass it to foreach
[23:38:42] <mediocretes> f = function(x){ db[x].update({blah}, {bleh});}
[23:38:44] <mediocretes> like that
[23:38:56] <mediocretes> then db.getCollectionNames().forEach(f)
[23:39:06] <grafman> db.getCollectionNames().forEach(function(x){ db.getCollection(x).find().sort({_id:-1});})
[23:39:14] <grafman> I'm trying to do something like that
[23:39:36] <mediocretes> so, you aren't outputting the results, and those results will be sort of ugly
[23:40:08] <grafman> well the idea is to get the last inserted document from each collection
[23:40:19] <mediocretes> I see
[23:40:43] <grafman> Someone else has a python script that eats the output
[23:40:54] <mediocretes> try this:
[23:40:55] <grafman> the guy who does our mongo admin and development quit
[23:41:05] <mediocretes> do you need to know which collection it came from?
[23:41:24] <grafman> no
[23:41:30] <mediocretes> also, are you sure _id is in order? do you have a timestamp field or something?
[23:41:43] <mediocretes> I feel your pain, and I charge $250 an hour :p
[23:41:45] <grafman> _id should be in order
[23:41:47] <mediocretes> ok
[23:41:57] <grafman> lol
[23:42:16] <mediocretes> f = function(x){ db[x].find().sort({_id:-1}).limit(1).forEach(printjson);}
[23:42:25] <mediocretes> then do db.getCollectionNames().forEach(f)
[23:42:49] <mediocretes> let me try that in a shell before I bless it :)
[23:42:52] <mediocretes> yeah, should work
[23:43:05] <grafman> very nice
[23:46:04] <grafman> mediocretes: Thanks, that's perfect
[23:46:29] <grafman> are your a stackoverflow member?
[23:47:07] <mediocretes> probably at some point, not a regular though
[23:48:18] <grafman> Ok, I asked the question there and I'll credit you with the answer if you know the account name you use there
[23:50:35] <mediocretes> same name, thanks!
[23:50:53] <grafman> No problem, you saved me some major gray hairs
[23:51:03] <grafman> you deserve the credit
[23:51:50] <grafman> my manager walked up to me and said we need a report of xyz from mongo, that one Bill used to do
[23:52:13] <grafman> I said "Bill had a mongo what?"
[23:52:29] <grafman> he said "we need it by tomorrow"
[23:52:43] <grafman> so thanks
[23:53:05] <mediocretes> I know the feeling
[23:55:36] <grafman> Mediocretes: is this you? Software Developer at iGoDigital
[23:55:41] <mediocretes> yes
[23:55:56] <grafman> :)