[04:33:12] <jwilliams> when using the skip and limit functions, is it possible the number of docs read would be smaller than the limit specified? for example, specifying limit 10000, but only 5000 results are actually read.
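A minimal sketch of what limit guarantees, using a plain Python list as a stand-in for a query result set (not actual driver code): limit is an upper bound, so a cursor simply returns fewer documents when fewer match.

```python
# Sketch of skip/limit semantics with a plain list standing in for
# a result set. limit caps the page size; it does not promise that
# many documents actually exist.
matching_docs = [{"_id": i} for i in range(5000)]  # only 5000 match

def apply_skip_limit(docs, skip, limit):
    return docs[skip:skip + limit]

page = apply_skip_limit(matching_docs, 0, 10000)
len(page)  # 5000, not 10000
```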
[04:34:14] <siva> Thinking to create a skill matrix model using mongodb any ideas?
[04:40:01] <ron> siva: data modeling can change between different nosql databases. normally, you should model your data to make querying easier.
[04:42:15] <siva> Ron, Thanks. I'm just planning to use mongodb as it seems to be easier and it can work with C#
[04:42:52] <ron> siva: well, I can rant about the usage of C#, but I'll give you a pass on it ;)
[04:43:30] <siva> :( For example, I would just have a model Person->Skills->Technical->Competency Level
[07:41:25] <skython> Hi, is it possible and a good idea to deliver mongodb bundled with a node.js application? I'm looking for a way to build a ready-to-use installation for my app which is using mongodb
[07:52:47] <wereHamster> skython: make a server image, complete with mongodb, node and all other dependencies
[07:54:04] <skython> You mean an image for a virtual machine? I would prefer a native way of installation for each platform
[09:52:01] <gigo1980> hi, i have a problem: i have some documents with microtime (unix timestamp), how can i add a date entry to each document? is there a query that can do this?
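One app-side approach (a sketch with made-up field names "microtime" and "date"): iterate the collection, convert each timestamp, and write the new field back per document; in the shell, a forEach loop would do the same job.

```python
# Hedged sketch: deriving a date field from a unix-timestamp field.
# Field names are made up; in practice you would walk a cursor (or use
# a shell forEach) and save each updated document back.
from datetime import datetime, timezone

def add_date_field(doc):
    # assumes doc["microtime"] holds seconds since the epoch
    doc["date"] = datetime.fromtimestamp(doc["microtime"], tz=timezone.utc)
    return doc

doc = add_date_field({"microtime": 1340000000})
```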
[09:53:04] <mids> without journaling data is only written to disk every 60 seconds
[09:53:38] <W0rmDrink> so mids - the assumption is that ext4 full data journaling won't journal mmap updates?
[09:53:57] <calvinfo> if I am missing a datafile, is there any way to force mongo to continue without it and boot anyway?
[10:50:37] <W0rmDrink> When mongodb updates data files - it does this by just changing the mmapped location?
[12:01:27] <jondot> hi all, anyone knows how best to handle using unicorn (ruby) and mongodb? the sample https://github.com/mongodb/mongo-ruby-driver/blob/master/test/load/unicorn/unicorn.rb.template encourages sharing one global variable
[12:22:01] <skot> W0rmDrink: no, it just updates the data in memory
[12:22:22] <skot> the os flushes from mmap virtual addresses to files.
[12:22:33] <W0rmDrink> skot, thanks, actually saw good explanation in video on journaling
[12:27:20] <W0rmDrink> it was just that the guy said the journal is written before the DB - and I was not sure how exactly they manage that - since reads are like read committed - but he soon after explained they mmap the data files 2x
[12:30:25] <FerchoDB> Hi. What is the best way to query the union of an array element of a collection of documents? I mean, I have a collection of documents and each one has an array of Something. I want to query all the "Somethings", but I don't want an array of arrays, just one array of "Somethings"
[12:36:56] <skot> Might be more clear with an example. Can you post samples to gist/pastebin/etc?
[12:37:52] <skot> I think you want $all/in probably: http://www.mongodb.org/display/DOCS/Advanced%20Queries#AdvancedQueries-%24all
[12:40:55] <FerchoDB> You're right, I wasn't clear. I have a collection of Customers. Each Customer document has an array of Orders. I need to query all orders, but instead of having an array of arrays of orders, I just need one array containing all the orders
[12:51:18] <NodeX> that's an appside problem unless you want to map/reduce
[12:53:50] <FerchoDB> that's what I thought. I'm still with 2.0.6, but maybe the "MongoDB's New Aggregation Framework" in 2.1 will also do the trick
[13:00:42] <augustl> considering using mongodb queries with regexps instead of a full text search engine. But that means adding indexes for all fields present in my documents, and the documents typically have many different fields. Specifically, there are groups of documents with similar fields, but the fields differ a lot from group to group
[13:00:49] <augustl> should I just use a full text engine? :)
[13:02:26] <augustl> we're talking potentially hundreds of indexes on a single collection
[13:02:34] <deoxxa> augustl: regexes are going to be craaaaazy slow if you have a more than insignificant number of documents
[13:02:59] <deoxxa> i use elasticsearch with mongodb - it's a pretty good fit
[13:03:08] <deoxxa> but there's plenty of options out there
[13:03:21] <augustl> deoxxa: I'll have hundreds or at most thousands of documents to search
[13:03:37] <augustl> that is, I'll have a gazillion documents in the collection, but the searches will always be scoped on a "foreign key"
[13:03:59] <augustl> so assuming the query planner in mongo is clever enough to filter on that "foreign key" first, it'll be relatively few documents
[13:04:16] <deoxxa> .explain() will tell you whether it is or not
[13:04:21] <mw44118> hi -- i need help understanding replica sets
[13:04:49] <mw44118> i set up a replica set with one primary and one secondary. then the amazon outage killed my primary, so now my secondary is the new primary
[13:05:31] <mw44118> here is my problem: how do i configure my app that talks to mongo so that the app keeps working even after an election?
[13:05:45] <mw44118> right now, my app talks directly to the primary mongo
[13:06:25] <augustl> deoxxa: also just realized that indexes are pretty much irrelevant if it does a full regexp compare, obviously :)
[13:06:26] <mw44118> I think my app needs to talk to some kind of load-balancing thingy that
[13:07:08] <augustl> deoxxa: elasticsearch looks good, thanks for the suggestion
[13:07:28] <deoxxa> yeah, it's really easy to get running
[13:07:40] <deoxxa> it eats quite a bit of ram, but is pretty quick if you can throw hardware at it
[13:08:19] <augustl> throwing hardware is both easy and cheap :)
[13:09:21] <deoxxa> yep, much cheaper than spending a couple of weeks restructuring things to work with slightly less hardware
[13:10:52] <mw44118> should i run mongos on a different box than my primary replica?
[14:10:39] <arkban> mw44118: think of it like your load balancer, if your primary box disappears and it has mongos, you can't connect to the secondaries
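The usual fix for mw44118's problem is to hand the driver a seed list of several members plus the replica set name, so it can rediscover the primary after an election instead of pinning one host. A sketch of building such a connection string (hosts and set name are made up):

```python
# Sketch: a replica-set-aware connection string. Hostnames and the
# set name "rs0" are made up. Several seed hosts plus replicaSet let
# the driver follow elections rather than depending on one box.
hosts = ["db1.example.com:27017", "db2.example.com:27017"]
replica_set = "rs0"
uri = "mongodb://%s/?replicaSet=%s" % (",".join(hosts), replica_set)
# a driver of this era would then connect with it, e.g. pymongo's
# ReplicaSetConnection(uri)
```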
[14:32:20] <Bartzy> What book should I read? (a complete novice on MongoDB, need to evaluate it for our production (millions of users) app in the company)
[14:45:50] <kchodorow_> Bartzy: TDG is fine for general intro stuff, but it was written before sharding & replica sets were released
[14:45:51] <NodeX> db.places.find( {$or : [ { City : /ondon/ } , { City : /b/ } ] } )
[14:45:57] <kchodorow_> so those sections are a little thin/out of date
[14:46:18] <FerchoDB> thanks, I already tested it and it worked
[14:46:51] <NodeX> it will work but it is very very very inefficient
[14:47:00] <NodeX> first and foremost because you don't have a prefix
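To illustrate NodeX's point (field values made up): only a regex anchored at the start, like /^Lon/, can walk an index on City; an unanchored pattern like /ondon/ must scan every value. Python's re behaves the same way for matching, even though the cost difference only shows up server-side:

```python
import re

# Both patterns can match the same values; the difference is cost:
# in mongo, only the ^-anchored form can use an index on City.
cities = ["London", "Londonderry", "Birmingham"]

anchored = re.compile(r"^Lon")      # prefix: index-friendly in mongo
unanchored = re.compile(r"ondon")   # no prefix: full scan in mongo

prefix_hits = [c for c in cities if anchored.search(c)]
scan_hits = [c for c in cities if unanchored.search(c)]
```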
[14:47:34] <Bartzy> kchodorow_: Thanks a lot for the info.
[14:48:05] <Bartzy> I have some design questions, i.e. is mongo a good fit, before I dive into it. Hope it's fine to ask that kind of question here.
[14:49:05] <Bartzy> I'm part of a team that is developing a pinterest-like feature from our application's photos (user-created photos). Each "pin" has likes, views, and comments
[14:49:33] <Bartzy> each pin has view counts, and if you click on it, it also shows the viewers list, a comments count, and the last 3 comments (if you click you get all of them).. same for likes
[14:49:38] <Bartzy> It's really like pinterest.com
[14:49:49] <FerchoDB> thanks NodeX, yes I know it's going to be quite inefficient
[14:50:10] <Bartzy> We currently have around 150 million pins
[14:50:32] <Bartzy> We actually have 500 million "pins" (photos), but only 150 million of them are shared with the world, and those are the ones we will show on the pinterest-like feature.
[14:50:46] <Bartzy> Do you think MongoDB is suitable for this ?
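One hypothetical document shape for the pin Bartzy describes (every field name here is made up): embed the counters and only the newest three comments so the list view is a single read, and keep the full comment history in its own collection.

```python
# Hypothetical "pin" document (all names invented for illustration).
# Counters plus the last 3 comments are embedded for cheap list views;
# the complete comment history would live in a separate collection.
pin = {
    "_id": "pin123",
    "photo_url": "http://example.com/p/123.jpg",
    "views": 42,
    "likes": 7,
    "comment_count": 12,
    "last_comments": [
        {"user": "u1", "text": "nice"},
        {"user": "u2", "text": "cool"},
        {"user": "u3", "text": "wow"},
    ],
}
```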
[14:51:16] <FerchoDB> I've heard that foursquare runs on MongoDB, is that true?
[14:53:00] <FerchoDB> yes, apparently they do use mongodb
[15:03:04] <Bartzy> NodeX: So when should one use a RDBMS such as MySQL instead of something like Mongo? Only for transactional stuff like financing and payments?
[15:03:43] <NodeX> transactions are not unique to databases or RDBMS'
[15:10:27] <Bartzy> NodeX: So... when should you generally use an RDBMS?
[15:56:17] <devastor> Hi all, I got some "DR102 too much data written uncommitted" warnings and backtraces after initial sync when it was doing the replSet initialSyncOplogApplication stuff in mongo 2.0.4. It continued and completed ok after those, though. Is there a risk that some data didn't get written properly or anything like that?
[16:02:56] <Baribal> neil__g, I'd be all for it. I know a few buildings which have a rather simple and easily detectable geometric shape, and whose bombardment would further the cause of open source.
[16:06:17] <JoeyJoeJo> I have a function that returns the value of a text field. When I call that function using the button next to the text field, I can get the value of the text field. If I call that function from another function I get the error "cannot get value of null". Why does that happen?
[17:19:13] <tystr> I'm trying to wrap my head around map/reduce… so I have a noob question: I need to sum some numbers in two different collections then divide one sum by the other…
[17:19:24] <tystr> what's the proper approach to this?
[17:32:18] <kali> tystr: you perform both sum() separately, and then perform the division in your application code
[17:32:55] <tystr> kali yeah, that's probably how I'll do it in the application
[17:33:07] <tystr> I'm just in the terminal right now, wanting to run some quick numbers
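kali's suggestion sketched app-side, with plain lists standing in for the two collections (field names are made up): compute each sum separately, then do the division yourself.

```python
# Two independent sums, then a client-side division, as kali suggests.
# Plain lists stand in for the two collections; "amount" is made up.
coll_a = [{"amount": 10}, {"amount": 20}]
coll_b = [{"amount": 5}, {"amount": 10}]

sum_a = sum(doc["amount"] for doc in coll_a)  # 30
sum_b = sum(doc["amount"] for doc in coll_b)  # 15
ratio = sum_a / float(sum_b)                  # 2.0
```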
[19:22:32] <JoeyJoeJo> In pymongo I want to find all documents in a collection but db.collection.find() with no arguments returns an error. What am I doing wrong?
[19:47:34] <halcyon918> if I have an index defined like {accountId: 1, status: 1} and I look up a record but JUST "accountId" (ignoring status for now), will the index still be used, or should I have a second index for just accountId?
[19:49:26] <elarson> I'm trying to fix a race condition in an application and I'm considering using find_and_modify in the pymongo driver rather than using update. Is this a horrible idea? I'm concerned this will cause more locking or something similar that could slow down other operations
[19:51:08] <elarson> the docs don't suggest that findAndModify adds any extra locking, specifically the global lock, but I'm curious if that is really the case
[19:53:45] <Bartzy> what happens if I query something that doesn't have an index ?
[19:53:46] <Bartzy> it will just be slow, just like MySQL ?
[19:53:47] <Bartzy> also, creating an index on an existing collection means locking that collection? the entire DB?
[20:02:19] <skot> There is a non-blocking version where you can set background index creation
[21:36:44] <JoeyJoeJo> I'm trying to do a geospatial index for a large collection. I did db.collection.ensureIndex({'loc':'2d'}) and it's just sitting there with three periods instead of a prompt. Does that mean it's working?
[21:38:28] <JoeyJoeJo> I figured it out. I was missing a closing }
[22:39:08] <dstorrs> how do people normally manage backups? We're going to have something like 5-6TB to start with, and we're adding about 10G per day
[22:40:04] <dstorrs> I'm running into issues like "S3 / CloudFiles / etc can only store 5G per object"
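One common workaround for a per-object size cap like the one dstorrs hits: split the dump into fixed-size parts before upload and reassemble on restore. A sketch (the part size is shrunk to a few bytes purely for illustration; a real run would use something under 5 * 1024**3):

```python
# Sketch: splitting a backup into parts below a per-object size cap,
# e.g. the 5GB S3 object limit mentioned above. The tiny part size
# here is only for illustration.
def split_into_parts(data, part_size):
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

parts = split_into_parts(b"0123456789", 4)
# parts == [b"0123", b"4567", b"89"]; joining them restores the original
```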
[22:41:25] <BurtyBB> not using S3 is part of my solution