[00:08:04] <Zelest> i wish to store things like "when was this domain last poked" in order to avoid hammering the same domain/server
[00:08:18] <Zelest> as well as, does the domain use www and/or is this url ssl
[00:08:27] <Zelest> i bet i can store all that, but it feels like a lot of duplicate data
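The duplication Zelest worries about can stay small by keeping one metadata document per domain and checking it before each poke. A minimal Python sketch; field names like `uses_www` and `last_poked` are illustrative choices, not anything prescribed:

```python
import time

def domain_meta(domain, uses_www, is_ssl, last_poked=None):
    """Build a single metadata document per domain, so per-URL
    documents don't need to repeat these flags (sketch only)."""
    return {
        "_id": domain,                 # one document per domain
        "uses_www": uses_www,
        "ssl": is_ssl,
        "last_poked": last_poked if last_poked is not None else time.time(),
    }

def ok_to_poke(meta, min_interval=60.0, now=None):
    """Rate-limit check: has min_interval elapsed since the last poke?"""
    now = time.time() if now is None else now
    return (now - meta["last_poked"]) >= min_interval

meta = domain_meta("example.com", uses_www=True, is_ssl=False, last_poked=100.0)
print(ok_to_poke(meta, min_interval=60.0, now=200.0))  # enough time has passed
```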
[02:31:03] <owen1> using mongo with node. i get: "A Server or ReplSet instance cannot be shared across multiple Db instances". here is some of my code - https://github.com/oren/Y-U-NO-BIG/blob/master/food.js#L47
[02:31:26] <owen1> it happens only on the second request.
[05:49:04] <Loch> Can anyone tell me the performance differences between a standard HDD and an SSD? Say I'm reading 1 row from 100000
[05:49:29] <Loch> Will the difference be noticeable to the end user?
[06:54:19] <Kage`> Anyone know of any PHP+MongoDB forums systems?
[09:07:09] <BadDesign> Why do I receive the following error when I try to import a CSV file using mongoimport, even though my username and password are correct: "assertion: 9997 auth failed: { errmsg: "auth fails", ok: 0.0 }"?
[09:08:00] <BadDesign> Do I need to enable some kind of remote access to my database, given that it is hosted on a MongoDB provider?
[09:11:57] <BadDesign> The command I'm using is: mongoimport -h myhostname:myport -d mydatabase -c mycollection -u myuser -p mypassword --file myfile.csv --type csv --headerline
[09:15:41] <BadDesign> carsten_: that command will run the mongodb daemon, in my case the MongoDB server is hosted remotely on MongoLabs, and I want to import some data into it
[09:16:02] <BadDesign> I don't think I need to run another daemon locally for what I'm trying to do
[09:16:33] <carsten_> check with your hosting provider - i am not jesus in order to know what your provider is doing
[09:20:07] <skot> BadDesign: please also show that the same username/password work via mongo (javascript shell)
[09:20:34] <skot> (also, what version of mongoimport is this?)
[09:21:55] <BadDesign> skot: I created a new user on the database and it worked that way, I don't know why it doesn't let me use the already present user that is the same as the credentials for MongoLabs, nvm
[09:22:36] <skot> There is a difference between a user in the db and an admin user. The one mongolabs gave you was an admin user
[09:22:53] <skot> mongoimport only works with db users currently
[09:23:40] <BadDesign> skot: ok, thanks for the information
[10:27:10] <jwilliams_> i came across a python lib called Ming which can help lazily migrate schemas. is there any similar tool in scala/java for mongodb?
[11:29:07] <Fox`> hey, does anyone know how frequent the workshop conferences are in london?
[11:32:17] <Derick> Fox`: you mean "office hours" ?
[11:34:33] <skot> I think he means the workshops before the mongodb uk conferences
[11:35:05] <skot> Fox`: conferences in the UK happen once a year basically.
[14:15:13] <tubbo> i'm trying to write a conversion script to get data out of mongodb and into postgresql, using ruby on rails and db migrations
[14:15:42] <tubbo> the last step that i need is to be able to convert mongo data into CSV, and then import that CSV into the ready-made postgres database (which i have already mapped all of the former mongo models to)
[14:17:58] <tubbo> when i ran the script (which basically runs mongoexport a bunch of times), i got this lovely error
[14:18:01] <tubbo> "Invalid BSON object type for CSV output: 10"
[14:18:05] <tubbo> what does that mean, and how can i remedy it?
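BSON type 10 is the null type, which the CSV output mode refuses to emit. One workaround is to blank out nulls yourself before writing the CSV; a Python sketch with hypothetical field names:

```python
import csv
import io

def nulls_to_empty(doc, fields):
    """Replace None (BSON null, type 10) with '' so the row is CSV-safe;
    missing fields are treated the same way."""
    return ["" if doc.get(f) is None else doc[f] for f in fields]

docs = [{"name": "pizza", "rating": None}, {"name": "soup", "rating": 4}]
fields = ["name", "rating"]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(fields)
for doc in docs:
    # pizza's null rating becomes an empty cell instead of an error
    writer.writerow(nulls_to_empty(doc, fields))
print(buf.getvalue())
```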
[14:25:11] <gigo1980> hi, i have a 2 shard setup / is replicated… what can i do if shard A has 10GB and shard B has 100 GB ?
[14:32:45] <FerchoArg> I'm using the C# driver for mongodb. Is it correct to use the attribute [BsonId] on two different properties? It's a class whose primary key depends on two "columns" or fields
[14:39:11] <skot> No, you cannot do that. It is for a single field. You can create a compound index on two fields though.
[14:39:14] <Zelest> silly question, can I have a unique constraint on multiple fields? e.g, username+age+city together must be unique?
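Yes: a compound unique index covers this (in the shell of that era, something like `db.users.ensureIndex({username: 1, age: 1, city: 1}, {unique: true})`). The resulting semantics, sketched in Python, with the insert raising on a duplicate just as the server would:

```python
def build_unique_checker(fields):
    """Mimic a compound unique index: the TUPLE of field values must be
    unique across inserts, not each field on its own (sketch only)."""
    seen = set()

    def insert(doc):
        key = tuple(doc.get(f) for f in fields)
        if key in seen:
            raise ValueError("duplicate key: %r" % (key,))
        seen.add(key)
        return doc

    return insert

insert = build_unique_checker(["username", "age", "city"])
insert({"username": "bob", "age": 30, "city": "Oslo"})
insert({"username": "bob", "age": 31, "city": "Oslo"})  # fine: tuple differs
```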
[15:06:50] <infinitiguy> Do you need to have mongo configured in replica sets to do sharding?
[15:07:13] <skot> no, but it is highly suggested for data redundancy and reliability of your data
[15:07:18] <infinitiguy> example: http://www.mongodb.org/display/DOCS/Amazon+EC2+Quickstart#AmazonEC2Quickstart-DeployingaShardedConfiguration talks about having 9 instances - 3 per shard, whereas 2 are secondaries
[15:08:07] <infinitiguy> currently we're using xfs freezing and ec2 snapshots to backup our mongo server - would this method hold true if we just had 3 sharding servers with no replicas?
[15:08:41] <skot> yes, but during the freeze there would be no writes so probably not good if you don't have a replica set.
[15:09:50] <infinitiguy> hrm - im also wondering how sharding complicates the freeze. We do the freeze/snap at 10:15 via cron and I'm wondering what would happen if the freezes between 3 shard servers were slightly off (like a second)
[15:10:16] <skot> read the docs on a sharded backup, I can lookup the link if you can't find it.
[15:13:14] <opra> How do i find all documents with a field (Array) that is empty
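For the empty-array case, `{field: {$size: 0}}` (or an exact match against `[]`) does it. The matching rule, spelled out in Python over sample documents with a hypothetical `tags` field:

```python
def matches_empty_array(doc, field):
    """Equivalent of db.coll.find({field: {"$size": 0}}): the field
    exists, is an array, and has no elements."""
    value = doc.get(field)
    return isinstance(value, list) and len(value) == 0

docs = [{"tags": []}, {"tags": ["a"]}, {}]
print([matches_empty_array(d, "tags") for d in docs])  # [True, False, False]
```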
[15:28:27] <infinitiguy> how do you actually create the configdb when configuring mongo sharding? I'm pretty new to mongo and I'm just trying to set up a simple 2 node(or 3 node) shard.
[15:28:51] <infinitiguy> I read on the http://www.mongodb.org/display/DOCS/Configuring+Sharding page that you have to run a mongod --configsvr process for each config server. If you're only testing, you can use only one config server. For production, use three.
[15:29:11] <infinitiguy> but when I do, in the logs I get: (/data/configdb) does not exist, terminating
[15:29:46] <infinitiguy> I normally start mongo by pointing to a config file - maybe I should be putting config server stuff inside of that?
[15:30:24] <skot> you need to create the database directory first, as with any instance of mongod
[15:33:08] <skot> you can use a config file but the error will not be different
[15:36:30] <scoates> kchodorow_: FWIW, I am now "26.99GB resident" thanks to your advice, and faulting a whole lot less. Thanks (yet) again.
[15:36:41] <opra> skot: do you know how can find documents where a match only matches the last element in an array
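With the tooling of that era, matching only the last array element usually meant either a `$where` (e.g. `{$where: "this.hits[this.hits.length-1] == 3"}`) or storing the last element in its own indexed field. The intended semantics in Python, with `hits` as a hypothetical field name:

```python
def last_element_matches(doc, field, value):
    """Match only when `value` equals the LAST element of the array,
    mirroring a $where like 'this.f[this.f.length-1] == value'."""
    arr = doc.get(field)
    return bool(arr) and arr[-1] == value

docs = [{"hits": [1, 2, 3]}, {"hits": [3, 2, 1]}, {"hits": []}]
print([last_element_matches(d, "hits", 3) for d in docs])  # [True, False, False]
```

Duplicating the last element into its own field on every write is usually the better trade, since `$where` runs JavaScript per document and cannot use an index.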
[15:38:18] <infinitiguy> I'm trying to get a shard and config server on the same box - if I were to do it with config files I'd have to have 2 separate config files for the mongod processes - correct?
[15:38:41] <infinitiguy> there's no way to specify config ports and shard/db ports to be separate within the same config?
[15:39:13] <adamcom> infinitiguy: yup - you need separate config files - one for each mongod instance you plan to run (if you wish to use config files)
[15:48:23] <skot> zirpu: it mostly depends on your disk/mem
[15:48:34] <skot> opra: depends on your doc, but yeah.
[15:48:55] <scoates> zirpu: a 3-node replicaset without sharding? if the nodes are all the same, a N-node replicaset without sharding should be just about the same as a 2-node replicaset without sharding (all writes go to the primary).
[15:50:16] <zirpu> scoates: ok. so i can write to the primary, but the 2ndaries can't keep up.
[15:50:42] <FerchoArg> I noticed that using the BSonId attribute, when I instantiate an object, it already has an ObjectId hash. Is there a way to distinguish objects that have been retrieved from mongodb and those that have been just instantiated with "new" ?
[15:50:44] <zirpu> i'm wondering how to figure out what the max write rate for this particular disk/mem combo would be. it's hosted hardware.
[15:50:57] <scoates> what do you mean keep up? like, the replicas can't keep sync?
[15:51:12] <opra> is there any way i can limit the query to a specific array amount
[15:52:14] <zirpu> I turned off writes to the primary over the weekend and left them syncing. eventually one of them was killed by the kernel OOM killer. not really sure why.
[15:52:19] <scoates> zirpu: ah, maybe your oplog is too small, then?
[15:52:34] <scoates> zirpu: but if it's a constant stream of too much data, you probably need to think about sharding.
[15:52:35] <zirpu> i have a 4Gb oplog. maybe i should double that?
[15:53:22] <FerchoArg> yes, I thought that the Oid was generated at the time of saving that object into Mongo.
[15:53:27] <scoates> zirpu: do you query the secondaries?
[15:53:50] <zirpu> scoates: not yet. haven't loaded enough data to start working on the actual processing.
[15:54:11] <scoates> zirpu: you might just be able to turn off indexing, then, if it's disk or cpu-bound (not network-bound)
[15:54:54] <scoates> we have a 12h delayed slave on one of our web nodes just in case we do something *really* stupid, and I turned off indexing because we will hopefully *never* have to query it; that helped CPU and flush.
[15:54:56] <zirpu> oh that makes sense. before populating the collections i created the indexes.
[15:54:59] <FerchoArg> I eventually want to save that object, but there is a method that needs to know if it came from the db or if it has just been created. It's not a big deal, I can use an auxiliary var, but I wanted to know if there was some way to figure that out from the instance
[15:55:25] <zirpu> scoates: is there an admin command to turn off indexing? or do i just drop the indicies?
[15:56:09] <scoates> I'm on a train with questionable wifi right now, but it's definitely in the docs; pretty sure under rs config
[15:56:59] <scoates> IIRC you have to remove the node from the set and re-add it with createIndexes = false
[16:00:12] <zirpu> scoates: thanks. i'm heading to bart myself now.
[16:03:46] <adamcom> zirpu: on the OOM killer front (from a while ago) you should configure some swap - give the OS some room to maneuver: http://www.mongodb.org/display/DOCS/The+Linux+Out+of+Memory+OOM+Killer
[16:03:56] <adamcom> and in case you are worried about having data in swap:
[16:17:34] <mitsuhiko> so basically, not really a way to do that without causing major disruptions on the environment
[16:18:02] <skot> it depends, are the names staying the same?
[16:18:33] <skot> if you started up with ip addresses then it is a big deal, if you used aliases or names then it is more manageable
[16:18:49] <mitsuhiko> skot: no, we thankfully use dns
[16:19:58] <skot> then you just need to bring up a new instance and point dns to it, and make sure it has all the data there already, as config servers don't replicate from scratch
[16:40:56] <iamchrisf> anyone know when jira.mongodb.org is coming back online?
[17:08:12] <jstout24> meaning, i have an action and want to associate any data to it.. ie, `db.events.insert({ name: 'impression', data: { visitor: { $id: "…" }, template: { $id: "…" }, some_other_dynamic_field_we_want_to_track: { $id: "…" } } });`
[17:42:27] <stefancrs> I'm running mongodb under ubuntu in vmware in windows... If for some reason the host OS (windows) shuts down, mongodb won't get a clean shutdown, will leave its lock file behind, and I should run a mongod --repair
[17:43:00] <stefancrs> How do I automate that process so that everything starts up again? The virtual machine (ubuntu) starts up properly when windows boots, should I just always remove the lock file and do a --repair before starting the service upon boot?
[17:43:17] <stefancrs> (this is for a system that only will be used a few days)
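The boot-time decision can be scripted. A Python sketch that only *plans* the commands, assuming stock Ubuntu paths and service name (journaling, which comes up later in the log, is the cleaner fix):

```python
def boot_plan(stale_lock, dbpath="/var/lib/mongodb"):
    """Return the commands to run at boot. `stale_lock` says whether a
    non-empty mongod.lock was left behind by an unclean shutdown.
    Paths and service name are the usual Ubuntu defaults; adjust to taste."""
    steps = []
    if stale_lock:
        steps.append("rm %s/mongod.lock" % dbpath)
        steps.append("mongod --dbpath %s --repair" % dbpath)
    steps.append("service mongodb start")
    return steps

print(boot_plan(True))
```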
[17:45:02] <tubbo> how do i apply the results of a mongodump to my database
[17:49:47] <stefancrs> hm, maybe I could just enable journaling...
[17:54:21] <Mikero> Hey, how would one create a relation like this (http://mikero.nl/gtkgrab/caps/cbb1bc.png) in CouchDB? I currently have a character document with the junction table "characters_skill" embedded, with a reference to the actual "skill" table.
[18:02:24] <stefancrs> Mikero: for what it's worth, what you currently do is probably fine. What's the purpose of the reference and the separate skills collection?
[18:02:43] <personal> Hi all, I have a question on a data structure, was wondering if I could get any recommendations… I'm new to mongodb and don't know what to do without a subquery.
[18:04:01] <stefancrs> CarstenLunch: the question is fairly broad... :)
[18:04:25] <personal> Carsten: Is there any sort of "Contains" equivalent? -- I can give you a specific example and what I was thinking to make it quick.
[18:04:28] <stefancrs> personal: that totally depends on what data you need to store, what you need to query, and how you store it.
[18:04:51] <CarstenLunch> personal: please no high-level blather - please ask something specific one can answer...
[18:05:03] <stefancrs> personal: quick answer is yes, but the answer is useless to your specific case
[18:06:48] <personal> Let's say I'm making a Recipe book. I've got 12,000 recipes. 1,000 unique ingredients. I want to be able to say "Find all recipes that contain only ingredients x, y, z" -- Currently I have three collections: 1 with each recipe to be made, 1 with a table full of the ingredients, their relationship to the recipe, and how many parts of each go in, and 1 with a unique list of ingredients.
[18:07:29] <stefancrs> personal: do you need the second collection there?
[18:07:42] <Mikero> stefancrs: The reference to the skills collection is to get the name and description of the skill; it's the same for everyone, the experience is not. So to avoid duplication I'd rather put it in a separate table.
[18:08:06] <stefancrs> personal: what for? you can do queries like db.recipes.find({"ingredients" : "fish"})
[18:08:28] <stefancrs> Mikero: go for duplication instead
[18:08:37] <stefancrs> Mikero: store data in the way you want to query it
[18:09:07] <personal> stefan, Let's say someone has inputted EVERY ingredient that they have in their kitchen and I want to return EVERY recipe they can make.
[18:09:36] <personal> I'm searching off of perhaps hundreds of ingredients at a time.
[18:12:58] <personal> Select all recipes from the database that contain only eggs, sugar, milk, cheese, salt, pepper, bananas, tomatoes, yogurt, spinach, lettuce, white bread, wheat bread….
[18:14:12] <Mikero> stefancrs: Thanks, I'll do that. I'm still thinking in the RDBMS ways of doing things. I hope I'll just "get it" after a week or so, ha. Thanks again for the help, you might see me again later this week. :p
[18:15:39] <infinitiguy> If I'm using a mongo keyfile for authentication - do I just specify keyfile = /path/to/keyfile in my config
[18:15:55] <infinitiguy> or do i need something like keyfile = true and keyfilepath = /path/to/keyfile
[18:19:02] <stefancrs> personal: actually I don't know how to do that from the top of my head in any db...
[18:19:08] <personal> I guess I don't think I fully understand the power of embedded documents for relationships. I'm not seeing a way to filter out recipes that have ingredients that I don't have. It could just be my brain is fried atm, lol. I'm sorry.
[18:19:16] <stefancrs> personal: you want to exclude all recipes that have any ingredient that is NOT in the list
[18:25:52] <stefancrs> I wouldn't know how to do it in SQL either
[18:28:19] <zirpu> is there a way to determine if and how far behind replicas are from the primary? is it just the difference in optime.t between the primary and secondaries?
[18:30:01] <dgottlieb> zirpu: I believe the difference in optime is the only real metric
[18:30:34] <zirpu> good enough. i'm just looking for a way to backoff loading data into a replicaset when it gets too far behind.
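That optime-difference check can be scripted against the output of `rs.status()`. A Python sketch over a deliberately simplified status shape (the real document nests optime differently depending on version):

```python
def replication_lag(status):
    """Per-secondary lag in seconds: the primary's optime timestamp
    minus each secondary's, from an rs.status()-like dict (sketch)."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optime_t"] - m["optime_t"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }

status = {"members": [
    {"name": "a:27017", "stateStr": "PRIMARY", "optime_t": 1000},
    {"name": "b:27017", "stateStr": "SECONDARY", "optime_t": 990},
]}
print(replication_lag(status))  # {'b:27017': 10}
```

A loader could pause whenever any reported lag exceeds a threshold, which is exactly the backoff zirpu describes.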
[18:33:40] <personal> SELECT * FROM recipes WHERE (SELECT COUNT(*) FROM recipe_ingredients WHERE recipe_ingredients.recipe_id = recipes.id AND ingredient IN list) = (SELECT COUNT(*) FROM recipe_ingredients WHERE recipe_ingredients.recipe_id = recipes.id)
[18:35:29] <personal> Stefan, that's an example of a way to do it in SQL. You have to count all the ingredients that are in the recipe that are also in the list, and compare that to the total. That's probably the easiest way to check for exclusion of unknown ingredients that are missing (that I can think of)
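The same predicate fits MongoDB without any subquery: keep the ingredient list as an array on each recipe document, then filter with something like `{ingredients: {$not: {$elemMatch: {$nin: pantry}}}}`, i.e. match recipes having no ingredient outside the pantry. The logic in Python, with hypothetical sample data:

```python
def can_make(recipe, pantry):
    """A recipe qualifies when NO ingredient falls outside the pantry,
    i.e. its ingredient set is a subset of what's on hand."""
    return set(recipe["ingredients"]) <= set(pantry)

recipes = [
    {"name": "omelette", "ingredients": ["eggs", "salt", "pepper"]},
    {"name": "cake", "ingredients": ["eggs", "sugar", "flour"]},
]
pantry = ["eggs", "salt", "pepper", "milk"]
print([r["name"] for r in recipes if can_make(r, pantry)])  # ['omelette']
```

Embedding the array also makes the earlier `db.recipes.find({"ingredients": "fish"})` style of lookup work directly, which is why the separate junction collection may be unnecessary here.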
[18:36:19] <kchodorow_> iamchrisf: should be up again
[18:36:37] <personal> But my brain is freaking fried trying to find an equivalent in mongo, >_>
[19:02:03] <dijonyummy> what do you guys like most about mongodb? is it the easy object programming paradigm (no need for an orm, sql statements, or opening/closing result sets, etc)? or the scalability?
[19:03:10] <stefancrs> dijonyummy: what's the point of your question? for me it's the ease of programming using something that is very "javascripty", that it's schemaless, and the performance
[19:03:36] <multi_io> how can I persist all documents that were stored with "NORMAL" WriteConcern, i.e. in memory only?
[19:04:00] <personal> I like how easy it is to save records from python with pymongo… coll = db['collection']… coll.save(dictionary)
[19:04:03] <dijonyummy> just want to know whats best about it from a programmers perspective, whats javascripty about it? the json?
[19:04:34] <stefancrs> dijonyummy: the json, the queries and the fact that it uses javascript for certain operations....
[19:04:46] <stefancrs> dijonyummy: you can basically execute javascript code within a query
[19:05:02] <stefancrs> dijonyummy: which then is dealt with by mongo, not your client
[19:05:34] <dijonyummy> i actually like sql, but its the other stuff, skeletal code, orm, etc thats a pain, and which db syntax, oracle, postgres, etc i dont like. when doing my own personal projects i dont have much time. was thinking maybe nosql like mongo is great for that?
[19:05:40] <stefancrs> dijonyummy: but "what's javascripty about it" makes me think you should probably just read an introductory article about mongodb instead of asking people here
[19:06:06] <stefancrs> dijonyummy: yes mongo is maybe great for that...
[19:06:15] <stefancrs> dijonyummy: we can't answer those questions for you :)
[19:34:54] <infinitiguy> when doing mongo sharding how are application servers configured to talk to the DB? Should it be talking to whatever box runs mongos?
[19:35:00] <infinitiguy> or can it talk to any of the mongo nodes?
[19:35:16] <infinitiguy> im trying to figure out how I go from 1 mongo server to many mongo servers on the app side
[19:38:13] <Derick> infinitiguy: they need to talk to mongos
[20:39:20] <macabre> sirpengi: from the 3 different mongo drivers ive found online it seems like each one limits django in one way or another.. sessions, admin, orm, is this true?
[20:39:48] <macabre> its a general question as im just now starting to do some research
[20:41:05] <infinitiguy> can you change chunk size after it's established?
[20:41:09] <infinitiguy> im testing with a 1MB chunk
[20:41:14] <infinitiguy> and I want to move to a larger chunk size
[20:41:30] <sirpengi> oh, when I use django with mongodb it's only for custom parts
[20:41:42] <sirpengi> the auth and session I keep in some rdbms
[20:41:53] <sirpengi> so it's django+postgres+mongo
[20:42:56] <sirpengi> the django orm is tied to rdbms world, so the builtin session/admin stuff needs to be too
[20:43:16] <sirpengi> I think there's a project somewhere to enable it for nosql solutions, but I don't believe any of them are production stable
[20:43:27] <sirpengi> (or support the full features of the current orm)
[20:48:15] <infinitiguy> anyone have any thoughts on being able to change the chunk size? I over-rode the default of 64m and set it as 1M. I'd like to put it back to 64m. Can I just remove the specification in the config file and restart mongos?
[22:11:38] <skot> infinitiguy: that setting doesn't have any impact after the initial startup.
[22:12:26] <skot> What docs are in the config db and settings collection? db.settings.find() in the config db connected via mongos.
[23:45:25] <Goopyo> since keys can't contain periods '.', is there a way to replace that . with another character at the connection level?
[23:45:48] <Goopyo> i.e. not have to go back and modify all queries
[23:54:54] <dstorrs> Goopyo: _id.username is a perfectly legal key if your obj is (e.g.) { _id : { username : 'bob', user_id : 7 } ... }
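If the keys themselves must contain dots (e.g. domain names as keys), the drivers of this era won't substitute characters for you at the connection level; the usual workaround is a small recursive rewriter applied on write and inverted on read. A Python sketch; the fullwidth full stop as substitute is an arbitrary choice:

```python
def replace_dots(doc, sub="\uff0e"):
    """Recursively replace '.' in dict keys (illegal in MongoDB key
    names) with a substitute character, descending into nested dicts
    and lists; values are left untouched."""
    if isinstance(doc, dict):
        return {k.replace(".", sub): replace_dots(v, sub) for k, v in doc.items()}
    if isinstance(doc, list):
        return [replace_dots(v, sub) for v in doc]
    return doc

print(replace_dots({"example.com": {"a.b": 1}, "ok": [{"x.y": 2}]}))
```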