[02:29:05] <caitp> I'd like to start and stop a dummy database and populate it with dummy data during execution of unit tests, is there any facility for doing something like this?
[02:29:26] <caitp> eg, even if the production database is currently running
[02:45:00] <crudson> caitp: just run mongod --port <some other port>
[02:45:30] <caitp> and I can prevent it from persisting once the daemon shuts down?
[02:47:11] <crudson> set --dbpath and rm -rf the dbpath after? or --directoryperdb and delete just the test db's directory
[02:47:37] <crudson> or db.dropDatabase() through the driver after the tests run
[02:49:14] <caitp> I know it's a very RTFM question :)
[02:49:36] <caitp> so thank you for answering, right on
[02:50:08] <crudson> there are a number of ways - it's really up to you
[02:50:41] <caitp> well my preferred approach would be something that I can easily set up entirely within node.js vows or mocha or something
[02:52:15] <crudson> just have a test db that gets dropped at the end of the tests
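A minimal mocha sketch of the pattern crudson describes, assuming a recent version of the node.js native driver; the port matches a throwaway mongod started as above, and the database and collection names are placeholders:

    // test/users.test.js - run with: mocha test/users.test.js
    // assumes a throwaway mongod, e.g.: mongod --port 27018 --dbpath /tmp/test-db
    const { MongoClient } = require('mongodb');
    const assert = require('assert');

    const client = new MongoClient('mongodb://localhost:27018');
    let db;

    before(async () => {
      await client.connect();
      db = client.db('myapp_test');                // placeholder test db name
      await db.collection('users').insertOne({ name: 'dummy' }); // seed data
    });

    after(async () => {
      await db.dropDatabase();                     // nothing persists between runs
      await client.close();
    });

    describe('users collection', () => {
      it('finds the seeded dummy document', async () => {
        const doc = await db.collection('users').findOne({ name: 'dummy' });
        assert.strictEqual(doc.name, 'dummy');
      });
    });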
[03:09:43] <anuvrat> is there something I can do to make mongodb use more processing power and be a little faster?
[03:47:09] <crudson> anuvrat: assuming sensible queries and indexes, have enough RAM for your indexes and an SSD. It's a very general question though. "use more processing power" doesn't make a lot of sense unless your CPU is being taxed, in which case you can lower mongod's "nice" value.
[03:51:35] <caitp> it sounds like they are talking more about requesting more time from the kernel scheduler
[03:52:35] <caitp> maybe nice/renice mongod in linux
[03:56:36] <crudson> yeah, see the last bit of my comment
[03:57:56] <caitp> yeah, didn't notice :) but it sounds like they feel they are not getting enough processing time for some reason :p
[04:10:03] <anuvrat> crudson, lower priority? ... I want it to hog the system if it has to but be a little faster, why would I lower the priority?
[04:10:36] <crudson> lowering the nice value != lowering the priority - a lower nice value means a *higher* scheduling priority. I think in Linux terms; adjust for your OS.
[04:11:47] <caitp> it might not actually improve speed much, or worse it might have the opposite effect in some cases
[04:12:28] <anuvrat> okay crudson caitp I will lookup how to do that ... thanks
[04:13:23] <anuvrat> sometimes it happens that I need to restart the db for it to perform properly ...
[04:13:38] <anuvrat> are there any common reasons why it might appear to be stuck?
[04:13:46] <crudson> anuvrat: where is your performance lacking? reads, writes, mapreduce? process priority is unlikely to be an issue, as implied in my first bit
[04:14:10] <caitp> I've heard horror stories about mongodbs randomly failing to replicate or even respond after a period of time
[04:14:30] <anuvrat> caitp, that indeed is scary ...
[04:14:49] <caitp> it keeps sysadmins employed :) or helps get them fired, take your pick :D
[04:15:47] <anuvrat> so take this incident that just happened ... I ran a script which seemed to be stuck ... tried killing and re-running the script multiple times to no avail ... restarted mongodb and ran the script again, and it processed > 10k records in less than 4 minutes
[04:16:16] <anuvrat> caitp, there are no sysadmins ... it's a startup ... I am the developer, I am the sysadmin ... :(
[04:21:44] <crudson> "give it more CPU time" is probably not the question you should be asking
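Before reaching for scheduler tweaks, it is worth confirming the slow query uses an index at all; a mongo shell sketch in 2.4-era syntax, with the records collection and processed field being hypothetical:

    // does the stuck script's query use an index?
    db.records.find({ processed: false }).explain()
    // "cursor" : "BasicCursor" means a full collection scan; also compare
    // "nscanned" (documents examined) with "n" (documents returned)

    // if it scans everything, index the queried field
    db.records.ensureIndex({ processed: 1 })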
[05:06:43] <mrapple> i have a pretty massive collection, 100 mil documents or so, and most of the queries on that collection are timestamp $gte
[05:06:55] <mrapple> if i'm picking a shard key, should it be a hashed key of the timestamp field?
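The trade-off, as a hedged mongo shell sketch (the mydb.events namespace and source field are placeholders, and only one of the two keys can be chosen per collection): a hashed key spreads writes evenly but range queries can no longer target a subset of shards, while a range-based key keeps $gte queries targeted:

    sh.enableSharding("mydb")

    // option 1: hashed key - even write distribution, but every
    // { timestamp: { $gte: ... } } query is scatter-gather across all shards
    sh.shardCollection("mydb.events", { timestamp: "hashed" })

    // option 2: range key, compounded with another field so inserts of
    // monotonically increasing timestamps don't all land on one hot shard
    sh.shardCollection("mydb.events", { source: 1, timestamp: 1 })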
[05:41:03] <jgiorgi> has something changed recently (i.e. in the last year or so) with how mongodb handles memory? I see significant improvements in memory usage
[07:52:39] <crodas> where is the right place to ask about the nodejs driver?
[15:23:02] <ron_frow_> I understand mongodb has nothing in the sense of GIS. I don't need anything nuts, but it would be nice to, say, geocode an address and then be able to ask whether it falls within a given county
[15:23:06] <ron_frow_> or something along those lines
[15:23:57] <ron_frow_> I guess I could look up a city in address and look that up for its county
[15:29:48] <Derick> ron_frow_: mongodb has geospatial support
[15:30:51] <Derick> it definitely can do that if you have stored the polygons for counties in MongoDB: http://maps.derickrethans.nl/?l=timezone&lat=40.20&lon=-81.24&zoom=6
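A mongo shell sketch of that county lookup, assuming a hypothetical counties collection that stores one GeoJSON polygon per county (2dsphere indexes arrived in MongoDB 2.4):

    db.counties.ensureIndex({ boundary: "2dsphere" })

    // which county's polygon contains this geocoded point?
    // note GeoJSON coordinates are [longitude, latitude]
    db.counties.findOne({
      boundary: {
        $geoIntersects: {
          $geometry: { type: "Point", coordinates: [-81.24, 40.20] }
        }
      }
    })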
[15:57:00] <Nodex> Derick : are ObjectIds always indexed without specifying that you wish to create an index, or just "_id"?
[15:57:28] <ron> in a way, solr and lucene are the same project now. their releases are done together.
[15:57:35] <ron_frow_> I'll give it the benefit of the doubt
[15:58:02] <Nodex> + imho, now that solr has better JSON mapping, Elasticsearch has lost a lot of ground; 1-1 mappings between solr and mongo are almost 98% efficient now
[16:32:43] <Derick> it's made for text; it doesn't really have all the fancy update operators
[16:34:40] <ron_frow_> so shall I just pass a subset of my mongo docs over and refer back to mongodb data via id?
[16:36:10] <Derick> I think that's what most people do
[16:36:19] <ron> what we did in the previous workplace is store very common fields in solr, but once we needed to pull whole documents, we'd get the id from solr and query mongo. it's a bit of trial and error to find the proper balance for your application, and it can change over time depending on changes in your use cases.
[16:36:29] <ron> there's no single solution to fit all.
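A node.js sketch of the pattern ron describes: ask Solr for matching ids only, then fetch the full documents from mongo. The Solr core, field names and connection strings are placeholders, it assumes the Solr id field holds the hex string of the mongo _id, and global fetch needs Node 18+:

    const { MongoClient, ObjectId } = require('mongodb');

    async function search(term) {
      // 1. full-text search in solr, returning only the stored id field
      const res = await fetch(
        'http://localhost:8983/solr/products/select?wt=json&fl=id&q=' +
          encodeURIComponent(term)
      );
      const ids = (await res.json()).response.docs
        .map(doc => new ObjectId(doc.id));

      // 2. pull the complete documents out of mongodb by _id
      const client = await MongoClient.connect('mongodb://localhost:27017');
      try {
        return await client.db('shop').collection('products')
          .find({ _id: { $in: ids } }).toArray();
      } finally {
        await client.close();
      }
    }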
[16:36:48] <ron_frow_> yeah well my spatial shit is going to have to fit into it... which is going to make things interesting
[16:36:48] <Derick> ron: why didn't you opt for storing everything in solr too?
[16:36:58] <ron> the combination of mongo with an FTS engine can be _very_ powerful.
[16:38:10] <ron_frow_> hmm, I don't see the whole facet stuff in it
[16:38:16] <ron_frow_> I mean I see you can search fields etc
[16:38:35] <ron> Derick: that's a good question. When we started using Solr, it was in its 3.x versions, where it wasn't marketed as a full database (since 4.x they like calling it a nosql db as well). Obviously, the more you store, the 'heavier' it's going to be, and you want the index to stay lean and fast. there's also the matter of keeping the schema up to date.
[16:38:43] <ron_frow_> I see they have a support channel... I'll move in that direction
[16:38:55] <ron> in fact, in some cases, we stored data differently than we did in mongo, for internal uses.
[16:39:24] <ron> ron_frow_: keep in mind there's a difference between storing fields and indexing fields.
[16:40:15] <ron_frow_> I think you can even fetch the list of facets to search on out of solr
[16:40:20] <ron> right. forgot. solr is basically a wrapper for lucene that you can run standalone instead of embedded in your application. plus, it has clustering capabilities.
[16:40:34] <ron_frow_> eg, product price between 100-200
[16:40:45] <ron_frow_> I guess I could build that in ui really rather easily
[16:41:08] <ron> yeah, though that's not necessarily related to faceting.
[16:41:21] <ron_frow_> well they call it "faceted search"
[16:43:05] <ron_frow_> it basically does an aggregated group, so you can see how many items are by this mfg, how many products are in this price range, etc
[16:44:06] <ron> we used it to do things like you see in LinkedIn's search, where you see different groupings and counts on the side.
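The mongo-side analogue of that "aggregated group", as a sketch against a hypothetical products collection with an mfg field:

    // count products per manufacturer, largest group first
    db.products.aggregate([
      { $group: { _id: "$mfg", count: { $sum: 1 } } },
      { $sort: { count: -1 } }
    ])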
[16:44:58] <ron_frow_> I guess I am just trying to see if elastic search supports this
[16:45:19] <ron> they are very similar, from what I know.
[16:45:20] <ron_frow_> I mean don't get me wrong, the hassle saved by having it already distributed
[16:45:46] <ron_frow_> is going to be huge... last time I looked at lucene, that was kinda one of those figure-it-out-yourself things
[16:45:53] <ron_frow_> I appreciate all the input ron / Derick
[16:46:53] <ron> sure thing. I really think it's a good combination. mongodb is awesome and has very strong capabilities, but an FTS pretty much completes it.
[16:47:13] <ron> one of the possible problems though is keeping things in sync between the two.
[16:47:55] <ron_frow_> well I can handle that in business logic layers
[16:48:01] <ron_frow_> not that big of a deal for what I want to build
[16:49:28] <ron> right, but you need to keep in mind that if you don't manage transactions in any way, you could end up updating the database and not solr or the other way around. edge cases.
[16:51:55] <ron_frow_> ron you made the mistake of talking
[16:52:02] <ron> Derick: I admit that in some cases, I was thinking of indexing everything in solr and use a datastore (even a k/v one) just to store the json object. but like I said, it depends on the use cases and so on.
[16:52:12] <ron> DestinyAwaits: well, address the channel, not a person.
[16:54:58] <DestinyAwaits> hmm.. well it's personal, so I can't discuss it on the channel as it's getting logged
[16:55:57] <DestinyAwaits> ron_frow_: maybe you are right. well, I have to submit something which can land me a new job, so it's personal and I can't openly discuss it.. I hope you understand.. :)
[16:56:47] <ron_frow_> just seems a bit odd someone would pop on irc, pick a random person and ask them to review a project
[23:28:15] <euskode> Hey there, I am using the native MongoDB driver (version 1.3.11) against two different MongoDB databases, one running 2.4.5-rc0 and the other one on 2.4.5. I am seeing different behavior when running a simple query (col.find({'_id': {'$in': arrayOfObjectIDs}})) against each of the databases and toArray-ing on the cursor that returns; basically, 2.4.5-rc0 behaves as expected, whereas 2.4.5 does not. I want to make sure that this is inde
[23:28:48] <euskode> ing, do you guys know if anything related to find-ing or interacting with cursors has changed significantly enough between 2.4.5-rc0 and 2.4.5 to break this?
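For reference, a standalone node.js sketch of the query in question, with the object-literal braces the pasted snippet dropped; the connection string and database/collection names are placeholders:

    const { MongoClient } = require('mongodb');

    async function fetchByIds(arrayOfObjectIDs) {
      const client = await MongoClient.connect('mongodb://localhost:27017');
      try {
        return await client.db('mydb').collection('mycol')
          .find({ _id: { $in: arrayOfObjectIDs } })  // arrayOfObjectIDs: ObjectId[]
          .toArray();
      } finally {
        await client.close();
      }
    }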