[00:10:29] <sapht> you can check out spidermonkey's stdlib if you have similar problems
[00:10:52] <sapht> same JS engine that powers Firefox
[00:12:11] <tomoyuki28jp> sapht: I see, that's really useful info to me, thanks!
[02:01:20] <tomoyuki28jp> How can I do something like this? db.news.find().sort(Math.pow(((new Date() - news.updated_at) / 3600000) + 2, 1.5));
[02:05:51] <tomoyuki28jp> To implement a reddit-like ranking algorithm, is map/reduce the only way? It should be slower than a native function, right?
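For context: sort() only accepts a field specification, not an expression, so a common sketch of this kind of ranking is to precompute the score at write time and index it. The "votes" field, someId variable, and the limit below are assumptions for illustration, not from the discussion:

    // Sketch: keep a precomputed "score" field on each news document.
    db.news.ensureIndex({ score: -1 });

    // Recompute the score whenever a document changes (hypothetical
    // "votes" field; the decay formula is the one quoted above).
    var doc = db.news.findOne({ _id: someId });
    var ageHours = (new Date() - doc.updated_at) / 3600000;
    db.news.update(
        { _id: doc._id },
        { $set: { score: doc.votes / Math.pow(ageHours + 2, 1.5) } }
    );

    // The ranked query is then a plain indexed sort:
    db.news.find().sort({ score: -1 }).limit(30);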
[05:41:53] <lupisak> Hi, everyone. what do you guys recommend? mongoid or mongomapper?
[06:15:04] <wereHamster> lupisak: whatever floats your boat.
[06:18:56] <lupisak> wereHamster: well none of them do at the moment :)
[06:52:54] <jwilliams_> how can we know if the balancer has finished its task?
[06:53:36] <jwilliams_> is there any keyword that can be used to check whether the balancing task is done?
[06:54:18] <jwilliams_> with v2.0.1, only [balancer] can be seen while it is executing.
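For what it's worth, on a 2.0-era cluster you can watch the balancer's distributed lock in the config database (a sketch, run against a mongos):

    use config
    db.locks.find({ _id: "balancer" })
    // state: 0 means the lock is free (no balancing round running);
    // a non-zero state means a migration round is still in progress.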
[08:16:59] <Turicas> what's the best way to ensure that a specific index was used in my query?
[08:17:45] <ron> check in the past or ensure in the future?
[08:18:57] <Turicas> ron, the problem is that I want indexOnly=true for some query, but it is not using the index I want, so indexOnly=false... (if it used that index, indexOnly would probably be true)
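For reference, hint() forces a candidate index and explain() shows whether the query was covered; the collection, field, and index below are made up for illustration:

    // Force the index and project only indexed fields (excluding _id,
    // which is not in this index) so indexOnly can be true.
    db.photos.find({ uid: 12345 }, { uid: 1, _id: 0 })
             .hint({ uid: 1 })
             .explain()
    // Check "cursor" (should name the forced index) and "indexOnly".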
[09:02:23] <jwilliams_> is there any possibility that during a chunk move the docs count value may be reduced? e.g. the original count is 123; during an insert there is a chunk moving; then count() again during the procedure and the count becomes 100?
[09:35:08] <jwilliams_> i have clients executing insert commands with java driver 2.6.5. at the same time, the server is running the balancer. after the insert processes finish, i notice the collection count value may fluctuate
[09:35:34] <jwilliams_> e.g. the count before the insert is 200; the first count during the insert shows 180, then it increases back to 200, then 210, and then drops to 180 again.
[09:35:51] <jwilliams_> is this normal (e.g. because of moving chunk)?
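A plausible check (a sketch; the collection name is invented): on a sharded 2.0-era cluster, count() sums per-shard totals, which can include documents that exist on two shards mid-migration, while iterating a cursor through mongos filters by chunk ownership:

    db.mycoll.count()           // fast, but can fluctuate during migrations
    db.mycoll.find().itcount()  // iterates the cursor; stable but slower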
[10:03:52] <Turicas> Anyone using Python here? I've just released mongodict: http://pypi.python.org/pypi/mongodict (use MongoDB as a key-value store)
[10:13:27] <newcomer> hi folks.. I'm new to mongodb.
[10:13:38] <newcomer> I have the below structure under an RDBMS that I plan to move to mongodb. However, when I check the size of the mongo db I can see it bloats. RDBMS: table hotel_basic_info (hotel_id), table hotel_dtl_info, table hotel_images, table hotel_facility, table hotel_blocked_dates. I tried creating a MongoDB structure as hotel_id [basic+dtl+images+facility+blockeddates]
[10:24:38] <newcomer> for each hotel I create a collection, then in each collection: 1 doc for basic + 1 doc for dtl + many docs for imgs + many docs for facility
[10:25:13] <NodeX> does your RDBMS have 65k tables?
[10:25:29] <newcomer> the idea being I pick a collection & get all its related info (basic+dtl+imgs+facility) in 1 shot
[10:25:45] <jwilliams_> is db.runCommand({ flushRouterConfig : 1 }) the right command to flush the cache so that the newer inserted data can be displayed?
[10:25:58] <Derick> newcomer: you should stick all of those in one *document*, and all the documents (each containing a hotel + its info) in one collection
[10:27:22] <NodeX> each collection carries an overhead with the journaling iirc which is why your data is so large I would imagine
[10:31:13] <newcomer> I'm running a collection from hotel1 to hotel60k & each of those having their own docs which are [info+basic+imgs(multiple)+facilities(multiple)]
[10:31:39] <newcomer> basic idea being to avoid sql joins & get all data at 1 go
[11:03:29] <newcomer> I'll do 1 thing.. create a collection called "Hotels", then for each hotel I'll retrieve the basic, details, images, facility as separate docs & then put em into the hotel doc ... that way I shall have only 60k docs with the info nested
[11:04:07] <NodeX> you don't need to put them into separate docs
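A sketch of the single-collection layout Derick describes; all field names and values below are invented for illustration:

    // One "hotels" collection; each hotel is one document with its
    // related info embedded, so a single find returns everything.
    db.hotels.insert({
        _id: 42,                                  // hotel_id
        basic: { name: "Seaside Inn", stars: 3 },
        detail: { rooms: 120, checkin: "14:00" },
        images: [ { url: "http://example.com/lobby.jpg", caption: "lobby" } ],
        facilities: [ "pool", "wifi" ],
        blocked_dates: [ ISODate("2012-08-01T00:00:00Z") ]
    });

    // All data for one hotel in one shot, no joins:
    db.hotels.findOne({ _id: 42 });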
[11:29:06] <Bartzy> User liked something. we update that something's document with increment to the like count and the user id. Why enable safe mode here ?
[11:29:41] <Bartzy> kali: OK - so for user data (likes, comments, profile of course) - it is preferable to "lose" the performance benefits and have safe mode on ?
[11:30:23] <kali> Bartzy: this is for you to decide, really, but this is my setup. safe everywhere except for logging
[11:30:44] <Bartzy> kali: And this is usually the best practice ?
[11:30:54] <Bartzy> I saw that mentioned on MongoDB in Action too.
[11:31:04] <Null_Route> Hi Guys! We're running MongoDB 2.0.2, and we've recently been seeing strange network disconnects - replSet unexpected exception in ReplSetHealthPollTask
[11:31:09] <Tankado> Is there going to be (or is there already) a version of mongodb that works on PowerPC?
[11:31:10] <kali> Bartzy: this is my practice, and i fully agree with myself :)
[11:31:25] <Null_Route> Obviously, we're looking into infrastructure errors, (no obvious networking issues)
[11:31:42] <Bartzy> kali: safe mode off means the driver doesn't even know if the socket is open? :)
[11:31:43] <Null_Route> ...but I was curious as to whether this is a symptom of any MongoDB issues
[11:31:51] <Bartzy> kali: Or for example if there is a network error ?
[11:32:00] <ron> Bartzy: keep in mind that kali is a hindu goddess, so she knows best.
[11:33:00] <kali> Bartzy: nope, it just does not wait for the mongo server to "confirm" the write, but the write is received by the server process
[11:33:08] <Bartzy> if my MongoDB server is 2-3ms away from my web server clients (don't yell :)) - enabling safe mode means each write will take at least that long - right ?
[11:33:39] <ron> well, there are different levels of safety.
[11:34:08] <Bartzy> kali: So when safe mode is off, the driver still knows that the server received the request, but doesn't wait for the server to "confirm" that it has written it ?
[11:34:19] <Bartzy> if so, what would make the server reject it...
[11:36:23] <Tankado> does anyone have an answer regarding mongoDB support on ppc? :)
[11:36:26] <kali> Bartzy: i think the server has an incoming queue for writes, but we're at the limit of my knowledge of mongodb's internals... here be dragons
[11:36:32] <Tankado> it seems it's not supported but i want to be sure
[11:38:50] <ron> imagine disk space.. index constraints...
[11:39:08] <ron> if you have a replica set, then possibly unavailability of one of the servers
[11:41:40] <ron> basically, you need to consider the ramifications of data-loss for different parts of your application, and decide whether you want to handle it differently for different data types. if you can 'average' the safety, use that for simplification (especially at first). if not, employ a more complicated solution.
[11:42:13] <ron> for us, most data loss is not something we can afford, so we go safe. I imagine that if we used mongo for logging as well, we could have afforded it to be unsafe.
[11:43:23] <NodeX> I use safe mode for some things and not for others
[11:43:46] <NodeX> general consensus is if you can afford to lose it then don't use safe mode, because performance is improved
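For context, in the 2.0-era shell "safe mode" amounts to following each write with getLastError; a sketch (photoId and the collection are hypothetical):

    // Fire-and-forget: the write is sent and the client moves on.
    db.photos.update({ _id: photoId }, { $inc: { likes: 1 } });

    // Safe mode: block until the server acknowledges the write, which
    // costs at least one extra network round-trip per write.
    db.photos.update({ _id: photoId }, { $inc: { likes: 1 } });
    db.getLastErrorObj();   // err is null on success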
[11:48:50] <NodeX> anyone else think it's a massive waste of gas to burn the olympic torch for 16 days?
[11:49:24] <Tankado> Derick : is mongoDB supported for ppc ? or are there plans to make it work on ppc?
[11:50:01] <Derick> Tankado: I don't think we officially support it, or are planning to do that *right* now. But if you compile it yourself, what doesn't work?
[11:51:22] <Tankado> I didn't try it, i just saw there is a bug with decoding the data, listed in a few places when i searched the web (it says mongo only supports little endian while ppc is big endian)
[11:51:50] <Tankado> i don't want to start integrating it and then find out it's not compatible or has bugs on ppc
[11:52:17] <Derick> why are you using PPC btw? (just curious)
[12:01:58] <Null_Route> My MongoDB issues go pretty deep - but it's the symptoms which are killing me - mongos returns a DBException to our Java client when it encounters a RETRYABLE issue
[12:02:24] <Null_Route> ...which kills all our mongos's whenever something fails over (which redundancy is supposed to make transparent)
[12:04:13] <Null_Route> Y'see, for some reason, our MongoDB servers are losing connectivity. As such, the mongos's need to update their sharding tables. The first request to each shard, though, returns an EXCEPTION, which our Java code rightly croaks on. But all that's needed is to retry the query. Why does it throw an exception if it's not fatal?
[12:04:33] <ron> solution: fix the damn connectivity
[12:04:59] <Null_Route> ron: Working on it, but that's NOT the problem - I should be able to fail over MongoDB instances without restarting all my mongos's!
[12:05:29] <Null_Route> and Mongos shouldn't return an Exception for something which it knows is retryable
[12:06:00] <ron> you don't think a database server losing its connectivity is a problem? I'd say it's A problem if not THE problem. dude, it's your backend!
[12:06:01] <Null_Route> the first of the 2 being the biggest issue. A failover should be COMPLETELY transparent - that's how it's built
[12:06:37] <sunoano> any meteor users in here? if so, do you run meteor on top of a replica set, maybe even sharded?
[12:07:13] <Null_Route> ron: There _IS_ an issue with connectivity (or somewhere in the OS/infrastructure), but the system STILL should be able to fail over cleanly
[12:07:20] <ron> Null_Route: sorry, I can't really help you with that :-/
[12:07:56] <Null_Route> ron: You can't help me with the infrastructure issues, I understand. But why is the first suggestion when there is a failover "Have you restarted the Mongos's?"
[12:08:18] <ron> Null_Route: no, no, I can't help you with THAT.
[12:09:00] <ron> Derick can, but he doesn't want to.
[12:10:44] <NodeX> ron = biggest troll on freenode
[12:10:49] <ron> Derick: Null_Route wants answers only you can provide.
[12:10:57] <ron> NodeX: dude, you have no idea how wrong you are.
[12:11:04] <Derick> during failover it's possible that you can't do writes until the new primary is elected. During this period, all non-retryable operations (such as inserts/updates) will fail
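A retry wrapper along those lines, sketched in shell JS (a Java-driver version would catch the exception the same way; the names, backoff, and collection are invented):

    // Retry a write a few times, since writes fail while a new
    // primary is being elected during failover.
    function withRetry(op, attempts) {
        for (var i = 0; i < attempts; i++) {
            try {
                return op();
            } catch (e) {
                print("attempt " + (i + 1) + " failed: " + e);
                sleep(1000 * (i + 1));   // back off before retrying
            }
        }
        throw "giving up after " + attempts + " attempts";
    }

    withRetry(function() {
        db.events.insert({ ts: new Date() });
        var err = db.getLastError();   // surface the write error, if any
        if (err) throw err;
    }, 5);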
[12:15:16] <Null_Route> Well, I'm going to continue looking at the networking (try bringing down the port-channels - maybe there's an issue there..)
[12:33:40] <remonvv> Null_Route, everything is your own fault unless you prove otherwise at which point we will claim your question was confusing or incomplete.
[12:34:06] <remonvv> Also, ron wears baby pink briefs.
[12:36:23] <ron> Null_Route: and trust me when I say it, remonvv knows it first-hand.
[12:42:43] <remonvv> I can neither confirm nor deny the truth of that statement.
[12:59:09] <PDani> is there a way I can bulk-update many documents in one request? The problem is that I have many relatively small documents, and it seems network latency is the bottleneck when updating them one by one
[12:59:26] <NodeX> update them all with the same value?
[14:25:23] <NodeX> I didn't read your problem correctly
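For the record (a sketch; 2.0-era servers have no bulk-write API, so per-document values still mean one update each): a single multi-update covers the same-value case NodeX asked about. Collection and fields are invented:

    // One request that applies the same change to every matching document:
    db.items.update(
        { status: "pending" },           // selector
        { $set: { status: "done" } },    // same modification for all matches
        false,                           // upsert: no
        true                             // multi: update all matches
    );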
[14:33:03] <Bartzy> NodeX, Derick: So any insight on the general "best practice" limit of the element count ?
[14:33:15] <Bartzy> What would be a bad idea to use $ne on ?
[14:33:45] <Derick> Bartzy: elements inside the array in a document?
[14:34:21] <Derick> you're limited to 16MB per document
[14:34:21] <Derick> and you need to realize that MongoDB might need to move documents around if you're making them larger (by appending to an array)
[14:34:27] <Null_Route> NodeX: Bartzy: if you guys remember my issue with failing MongoDB instances - it was caused by CentOS setting ulimit nproc to 1024
[14:34:53] <Null_Route> - it's the default on CentOS 6
[14:36:55] <Null_Route> but I was getting errors about pthreads not being able to spawn
[14:37:12] <Derick> right, cause threads are also "processes"
[14:37:21] <Bartzy> Derick: Yes, I understand that (about moving documents). I have nothing to do about it - people like photos. I need to push that :D
[14:37:43] <Bartzy> Derick: I know of the 16MB per document. How can I get the size of a specific document, or the average size of documents in the collection.. or the maximum size? :)
[14:37:54] <Derick> you could push your likes out of the document into their own collection if it becomes a performance issue
[14:38:25] <Bartzy> Derick: And last question - The 16MB limit is general. What is considered a bad number to do that $ne query on ?
[14:38:38] <Bartzy> Derick: Yeah but I would like to know in what number of likes I should do that
[14:38:54] <Derick> Bartzy: average size you can do on the shell
[14:38:54] <Bartzy> and that is pretty much normalization, which I wanted to avoid by using Mongo in the first place
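The shell helpers for that (the collection name and photoId are illustrative):

    // Size in bytes of one specific document:
    Object.bsonsize(db.photos.findOne({ _id: photoId }));

    // Average document size in bytes for the whole collection:
    db.photos.stats().avgObjSize;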
[14:39:42] <houms> i have signed up for 10gen monitoring and have a test bed of just three servers. they are only in a replica set, and i have the agent running on all 3 servers. it shows on the dashboard but does not show any data or any of the other members
[14:42:41] <Derick> Bartzy: as for the number of likes... difficult to tell without knowing about your data load and access patterns
[14:44:26] <NodeX> Bartzy: I have a ratings collection that's a similar concept, I have around 800 "ratings" in a similar nested document to yours and the performance is good
[14:47:23] <NodeX> surely 100k likes in a document would take forever to render anyway?
[14:47:32] <Bartzy> in the index page I show the likes and views counts, and the last 3 comments and the comments counts - and also if the current user has liked, viewed or commented on that photo
[14:47:54] <Bartzy> and I show 50 of these photos in the index page
[14:47:57] <NodeX> so you only really need the last 3 then?
[14:48:01] <Derick> Bartzy: you can call bsonsize() on it on the shell
[14:48:05] <Bartzy> When a user clicks on the photo, I need all.
[14:49:18] <Derick> Bartzy: denormalize that data out - store all the likes in one collection, but only the last three and the additional info with the document - it's ok to have to do two updates..
[14:49:41] <Bartzy> what if a user deletes a comment then
[14:49:53] <Bartzy> I delete it in both the photo document and in the comments collection ?
[14:50:07] <Bartzy> comment=like=views , kinda. so I mix and match those words
[14:50:07] <NodeX> store the meta data inside its own collection
[14:53:55] <Bartzy> NodeX: So when I need all the data for a specific photo I do db.metadata.find({_id:photo_id}) ... How do I differentiate in my app code between comment and view and like ? Just loop through the results and fill an array of comments when type=comment, likes when type=likes ?
[14:53:55] <NodeX> if that's what your app requires then yes
[14:54:03] <NodeX> else query id= foo and type=bar
[14:59:54] <NodeX> In my own personal experience I used to store references and meta data for images as nested arrays inside a "galleries" collection. it seemed a good idea at the time, until I wanted to do more with the data; I changed it very quickly to its own collection
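A sketch of the two-write scheme Derick suggests (all names invented; lastThree is computed in the application, since 2.0-era servers have no $push-with-$slice):

    // Write 1: the canonical copy, one document per comment.
    db.comments.insert({
        photo_id: photoId, uid: 12345, name: "alice",
        text: "nice shot", ts: new Date()
    });

    // Write 2: the denormalized summary kept on the photo document.
    db.photos.update(
        { _id: photoId },
        { $inc: { comment_count: 1 },
          $set: { last_comments: lastThree } }   // app-computed last 3
    );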
[15:02:58] <Bartzy> NodeX: Each member is very small - uid and name string.
[15:03:23] <Bartzy> NodeX: Why can't they sort or order very well ?
[15:03:45] <Bartzy> I could also not fetch the entire "likes" array in the query - just get the likes counter when needed.
[15:03:59] <NodeX> because they're nested elements
[15:04:13] <NodeX> store the counter with the document
[15:04:25] <NodeX> best thing is to do it the way you think is best
[15:05:49] <Bartzy> I'm just designing it right now, we started to code it, and hit some "thinking" walls :p
[15:06:09] <Bartzy> We actually do need to get all the likes / comments / views array, because we need to know if the user is in that array.
[15:06:36] <Bartzy> in order to solve it in a prettier way - we need a users collection where each user has a "likes", "comments", "views" array, right?
[15:07:19] <NodeX> but that is one way to solve it
[15:07:52] <NodeX> but you still lose query ability, so it's best to have likes,views,comments collections (or one for all)
[15:08:28] <Bartzy> NodeX: But with a likes collection, if I show 50 photos on a page and want to know for each one if the user liked any of them (to show it in the UI) - I need to do 50 queries
[15:09:04] <Bartzy> db.metadata.find({_id:'photo-id-1', uid:12345}) ... for each photo-id on the page I'm showing
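That page can be a single query rather than 50 (a sketch assuming one document per like, with invented collection and field names):

    // Which of these photos has user 12345 liked? One indexed $in query:
    var pageIds = [ "photo-id-1", "photo-id-2" /* the 50 on the page */ ];
    db.likes.find({ photo_id: { $in: pageIds }, uid: 12345 },
                  { photo_id: 1, _id: 0 });

    // Backed by a compound index:
    db.likes.ensureIndex({ uid: 1, photo_id: 1 });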
[19:46:41] <jaha> what's the best way to store HTML in a mongo doc?
[19:50:18] <ra1n3r> hi. noob here. i'm trying to use the java driver for mongodb. but i'm not exactly sure what to do with the jar file i just downloaded... how do i add it to my classpath?
[19:50:52] <ra1n3r> if i do echo $CLASSPATH there's no output
[19:54:36] <ron> have you written any java code before?
[19:56:36] <ra1n3r> so it means current directory, but it's still not working when i try to use mongo in java
[19:56:43] <ra1n3r> it's been a while since i used java, rusty...
[19:56:43] <ron> okay, so add it to your classpath.
[20:08:32] <bichonfrise74> I'm using version 2.0.6, and I wanted to change the oplog size through /etc/mongodb.conf (oplogSize = 50000)… but when I do show dbs, the local database shows 1 GB… is this correct?
[20:08:50] <bichonfrise74> I was expecting to see 50GB for the local database
[20:11:13] <bichonfrise74> but the mongodb.log shows 50GB for the oplog size
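One way to verify (a sketch): oplogSize in the config file is in megabytes, so 50000 is roughly 50GB, and the setting only takes effect when the oplog is first created; the shell can report what the server is actually using.

    // On the replica-set member:
    db.printReplicationInfo()
    // prints the configured oplog size and the time window it covers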
[20:42:02] <ra1n3r> having trouble with the java driver for mongo. when i just do a simple Mongo connection = new Mongo("localhost", 27017);
[20:42:08] <ra1n3r> I get the error "Default constructor cannot handle exception type UnknownHostException thrown by implicit super constructor. Must define an explicit constructor"
[23:08:41] <vsmatck> I read some blurb a while ago about that possibly changing in the future.
[23:09:50] <vsmatck> I wonder if anyone has thought about out-of-order responses. The mongo protocol contains enough information to support that now.
[23:11:44] <vsmatck> Seems like if you moved from one thread per connection to 1 network thread and a pool of workers, it would allow you to get more performance by supporting out-of-order responses.
[23:14:38] <vsmatck> If you had two pipelined requests where the first would cause a page fault and the second would be served from memory, then the second, fast request would get stuck behind the first, slow one.
[23:17:09] <vsmatck> Seems like the queue the pool of workers would pull from could be protected by a reader/writer lock.
[23:17:56] <vsmatck> NVM on that last thing. Makes no sense.
[23:19:24] <vsmatck> Or maybe it does. Each queue element could contain a lock function which would either read lock or write lock depending on the job.
[23:19:50] <vsmatck> That would ensure fairness. Or that only one worker gets released when there is a write job.