PMXBOT Log file Viewer

#mongodb logs for Tuesday the 28th of August, 2012

[00:06:58] <bluethundr> _johnny: ok thank you
[00:50:14] <mkmkmk> can i upgrade a single secondary to 2.2 to try out the aggregation framework (if i run it with slaveOk) or do i need to upgrade the whole deployment first
[00:56:48] <Yiq> So whats a typical relation case and a nonrelational? maybe im stupid but i have a hard time understanding exactly what relational means?
[01:13:31] <awpti> Howdy folks - I've taken up the task of learning how to use MongoDB, but I do have some questions on general performance: In what situations does MongoDB start bottlenecking?
[01:22:32] <Init--WithStyle-> hey guys... was hoping you could help me figure out how to do a range query for a [0,0] value
[01:22:39] <Init--WithStyle-> I have a value "loc"
[01:22:44] <Init--WithStyle-> it's an x,y coordinate
[01:23:01] <Init--WithStyle-> I want to do a range query between [0,0] and [500,500]
[01:23:07] <Init--WithStyle-> the "box" between that area
[01:24:01] <Init--WithStyle-> My index is a geospatial index..
[01:42:01] <crudson> http://www.mongodb.org/display/DOCS/Geospatial+Indexing/#GeospatialIndexing-BoundsQueries
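
A minimal sketch of the bounds query described in crudson's link, assuming a 2d index on "loc" and a hypothetical "places" collection:

    db.places.ensureIndex({ loc: "2d" })
    // documents whose [x, y] point falls inside the box between [0,0] and [500,500]
    db.places.find({ loc: { $within: { $box: [[0, 0], [500, 500]] } } })
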
[01:54:11] <bluethundr> hey according to phpinfo I'm definitely using the right php.ini yet for some reason even tho I've added extension=mongo.so to the ini file mongo still does not show up in phpinfo.. anyone have any ideas?
[07:39:56] <[AD]Turbo> hi there
[08:34:35] <Ceeram> with db.collection.update( criteria, objNew, upsert, multi ), examples for criteria show things like { name:"Joe" }, but can criteria be used as {_id:"mongoidhere"} as well?
[08:34:49] <emocakes> afaik yes Ceeram
[08:35:00] <emocakes> theres a video I watched the other night, hold on
[08:35:15] <emocakes> http://www.youtube.com/watch?v=PIWVFUtBV1Q&feature=related
[08:35:48] <Ceeram> emocakes: i dont need to learn whole mongo, just need yes/no on that
[08:36:30] <NodeX> What are you trying to achieve?
[08:36:30] <emocakes> yes
[08:36:44] <emocakes> he wants to update a document finding it with its _id:
[08:37:09] <NodeX> if you're using or finding by _id be sure to do this ... _id : ObjectId("YOUR_ID")
[08:37:57] <Ceeram> NodeX: im not trying to find, just want to show similarities between REST http PUT method and mongoDb update
[08:38:18] <NodeX> find/update/remove .. it's ALL the same
[08:39:52] <Ceeram> so i can conclude _id can be used as sole criteria for update()
[08:40:34] <NodeX> yes
[08:40:45] <NodeX> but make sure you use OBjectId();
[08:40:56] <NodeX> ObjectId() *
[08:40:57] <Ceeram> and without modifiers it will replace the whole document
[08:41:02] <NodeX> yes
[08:41:08] <Ceeram> NodeX: all i needed to know thx
[08:41:19] <NodeX> $set : ... will set parts, and without it will overwrite everything
[08:41:21] <Ceeram> its conceptual only
[08:41:37] <Ceeram> right, exactly as i thought
[08:42:17] <Ceeram> that makes it easier for me to explain when to use PUT and POST on REST for those that are familiar with mongo
[08:42:47] <Ceeram> as what we just described, is exactly as PUT would/should do
[08:44:02] <NodeX> maybe
[08:44:34] <NodeX> I see PUT as more of an object / file than anything but that's personal choice
[08:44:42] <Ceeram> PUT /resource/YOUR_ID is like mongo update(_id : ObjectId("YOUR_ID"))
[08:45:08] <NodeX> that's your choice, I would not do it like that ;)
[08:45:09] <Ceeram> NodeX: its to explain the concept
[08:45:32] <Ceeram> its not about doing it that way, its about explaining what it does
[08:45:52] <NodeX> good luck ;)
[08:46:34] <Ceeram> anyway thanks for the answer
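
A minimal sketch of the two update shapes discussed above (collection and field names are assumptions; ObjectId("YOUR_ID") stands for a real id):

    // without modifiers the matched document is replaced wholesale, like a REST PUT
    db.people.update({ _id: ObjectId("YOUR_ID") }, { name: "Joe", age: 30 })
    // with $set only the named fields change, everything else is left alone
    db.people.update({ _id: ObjectId("YOUR_ID") }, { $set: { name: "Joe" } })
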
[09:06:13] <gigo1980> hi all, i have some mongos instances running on application nodes, if i make a change on the config servers (moveprimary of a database) the mongos instances do not recognize this
[10:31:27] <yati> Hi. suppose I have a document like {"foo": ["bar", "baz"]}. Now I want to append to that list only if that particular item does not exist already in the list. Is there a way to do that?
[10:32:29] <yati> I mean, appending "baz" to that list should do nothing as it already exists, but appending "hello" should append to it
[10:32:41] <NodeX> $addToSet
[10:32:49] <NodeX> (adds if not there) ;)
[10:47:25] <yati> NodeX: Thanks a ton :) I really need to get my operators right
[10:50:40] <NodeX> no probs
[11:00:48] <NodeX> ;)
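
A minimal sketch of the $addToSet answer for yati's document (the collection name and selector are assumptions):

    // starting from {"foo": ["bar", "baz"]}
    db.items.update({ _id: ObjectId("YOUR_ID") }, { $addToSet: { foo: "baz" } })   // no change, "baz" is already there
    db.items.update({ _id: ObjectId("YOUR_ID") }, { $addToSet: { foo: "hello" } }) // appends "hello"
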
[11:11:32] <Pandabear> Hi everyone
[11:12:26] <NodeX> Hi
[11:14:00] <Pandabear> I've got a burning question. I've just noticed that our backup mongodb server in master/slave replication is not working
[11:14:45] <Pandabear> assertion code: 14031 "Can't take a write lock while out of disk space"
[11:14:59] <Pandabear> But there is plenty of disk space
[11:15:09] <Pandabear> I've stop mongodb and restarted it
[11:15:23] <Pandabear> now when I try to resync it gives me the following error
[11:16:05] <Pandabear> "code" : 13294, "errmsg" : "exception: caught boost exception: boost::filesystem::remove: Permission denied: \"/var/lib/mongodb/database.ns\" db/pdfile.cpp 2151",
[11:16:13] <Pandabear> I'm using mongodb 2.0.4
[11:16:47] <Pandabear> Does anyone have a clue how to fix this, I've searched the web and docs already
[11:18:40] <manveru> Pandabear: what are the permissions on /var/lib/mongodb ?
[11:23:16] <Pandabear> manveru: read/write on owner only
[11:24:51] <manveru> and what user executes mongo?
[11:26:32] <emocakes> depends on what you choose :p
[11:27:20] <Pandabear> I've fixed it already. It was the permissions on the parent folder, it was root user not mongodb
[11:27:30] <Pandabear> Thanks for the help manveru
[11:52:30] <yati> I am using pymongo(I did not find a dedicated channel for pymongo, so asking here) - does using db.add_son_manipulator() affect the underlying database in any way? Can it be safely called multiple times? (as in running my app multiple times)
[12:48:14] <brahman> Hi. My cfg servers time is out of synch. Has anybody had any luck fixing the time skew without restarting the whole clusteR?
[13:10:03] <Clex> Hi. The documentation says "DELETE /databases/{database}/collections/{collection}/{_id}" to remove a document with the rest API. But the code does not seem to handle it: http://pastie.org/pastes/4603540/text . What did I miss?
[13:22:51] <NodeX> can you paste the endpoint please
[13:23:04] <NodeX> pastebin ... that paste does not show the query
[13:25:20] <Clex> Found what's wrong.
[13:25:22] <Clex> The mongod process includes a simple REST interface, with no support for insert/update/remove operations, as a convenience – it is generally used for monitoring/alerting scripts or administrative tasks. For full REST capabilities we recommend using an external tool such as Sleepy.Mongoose.
[13:39:05] <Gargoyle> Hey there. Anyone know if you can swap to the 10gen packages under ubuntu easy? (Is it just a case of un-install ubuntu package, add 10gen repo and install mongo-10gen? Does the data get preserved, or do you need to dump and restore?)
[13:39:44] <algernon> the data location may be different between the two, but otherwise it should be easily swappable
[13:40:42] <Gargoyle> Can anyone confirm the data location for the 10gen package?
[13:41:09] <Gargoyle> And any known gotchas going from 2.0.4 to 2.0.7 ?
[13:41:20] <algernon> appears to be /var/lib/mongodb for mongodb-10gen.
[13:42:03] <Gargoyle> algernon: Seems to be that for the ubuntu package too! :)
[13:44:45] <algernon> you should be good to go then
[13:45:55] <Gargoyle> Will give it a bash later. running replica set so shouldn't be too much of a hassle to recover a slave if it all goes wrong
[14:15:26] <richwol> I've setup MongoDB on Amazon EC2 and plan to store the database on an EBS volume. I created a 3GB EBS volume, mounted it and specified for Mongo to use it as the data directory. When launching Mongo it starts eating up the disk space (I can see the file j._0 has used up all 3GB). When I then try to run any sort of query it complains that it can't because there's no disk space. Any tips?
[14:17:11] <BurtyB> richwol, --noprealloc will probably help
[14:17:42] <BurtyB> richwol, and --oplogSize too thinking about it
[14:17:54] <richwol> @BurtyB Wouldn't turning off prealloc slow down reads?
[14:18:26] <Derick> you need more space richwol
[14:18:35] <Derick> or perhaps use --smallfiles
[14:18:44] <Derick> the journal by default is I think 1GB - per file
[14:19:08] <richwol> @Derick Do you know how much space? I couldn't find any minimum requirements.
[14:19:49] <Derick> richwol: what are you trying to do?
[14:19:55] <richwol> I'm happy to increase the size of the EBS (I don't want to sacrifice performance.. just unsure of what size I should be setting it to!)
[14:21:14] <richwol> Derick: Just a basic MongoDB setup. Quite heavy no of writes but the overall database will be fairly small (under 1GB of data)
[14:21:37] <Derick> right, but mongodb preallocates, and uses a journal
[14:21:49] <Derick> the journal alone takes up 3GB minumum (without using --smallfiles)
[14:22:04] <Derick> and data in your case probably either 3.75 or 5.75 GB
[14:22:31] <Zelest> o/
[14:22:31] <richwol> Is that 3.75GB with the journaling or was that just for the 1GB of data?
[14:22:37] <Derick> data
[14:22:50] <richwol> Wow!
[14:23:05] <Derick> first file = 256mb, second 512, third 1, fouth 2
[14:23:15] <richwol> Ah ok, gotcha!
[14:23:20] <Derick> the moment you write into the 3rd, the 4th will be created
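
Putting Derick's figures together: the journal preallocates 3 x 1 GB = 3 GB, and the data files run 0.25 + 0.5 + 1 + 2 = 3.75 GB once the third file is written into (the fourth 2 GB file is preallocated at that point), or 5.75 GB if another file is needed, so a sub-1 GB dataset can still occupy roughly 6.75 to 8.75 GB of disk unless --smallfiles is used.
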
[14:23:35] <Zelest> Why the increase in size? Why not just use multiple X Mb files? :o
[14:23:47] <Derick> Zelest: I don't know that :-)
[14:24:10] <Derick> with small files on, devide all those filesizes by 8
[14:24:20] <Derick> erm, 4
[14:24:23] <Derick> sorry :-)
[14:24:25] <Zelest> Ah
[14:24:38] <Zelest> Ugh, just biked 50km.. and now my knee is b0rked..
[14:24:50] <Zelest> Sitting in the middle of a parking lot with the mac and waiting for the ride home.. :D
[14:26:01] <kali> Zelest: it makes sense actually... it avoids having big files if your base is small, and it avoids having too many files around if your base is big
[14:26:29] <Zelest> Yeah true..
[14:26:33] <richwol> Does using --smallfiles have a big performance hit?
[14:26:44] <Gargoyle> Anyone know if there is currently a problem getting th emongo gpg key from ubuntu?
[14:37:07] <richwol> Apologies I got disconnected. If anyone replied to my question could you repost? Cheers
[14:39:03] <NodeX> [15:23:26] <@Derick> with small files on, devide all those filesizes by 8
[14:39:07] <NodeX> divide *
[14:49:40] <Derick> NodeX: bleh, I always get that one wrong :-/
[14:50:27] <NodeX> I was providing the correct way incase you didn't know ;)
[14:50:38] <Derick> it's ok
[14:55:49] <Mmike> Hello. If I'm connecting to a mongodb cluster (i have 3 servers in replication) is it possible that a client library will connect to each of them somehow, no matter i just connected to one of them?
[14:56:19] <algernon> technically, yes.
[14:56:36] <jarrod> especially if you slaveok=True
[14:56:42] <jarrod> are oyu using mongos?
[14:57:01] <Derick> Mmike: the php driver definitely connects to all of them
[14:58:57] <Mmike> So, If I have a setup that those 3 mongo servers are behind some loadbalancer (haproxy, let's say), and I do, in php $bla = new Mongo("mongodb://loadbalancer-ip", '...);
[14:59:12] <Mmike> I will actually get a connection trough the loadbalancers, but also directly to all the servers too?
[14:59:32] <jarrod> why haproxy
[14:59:33] <jarrod> and not mongos
[14:59:44] <Mmike> dunno, just stated that as an example
[15:00:10] <Mmike> I'm using haproxy for everything else, so I figured it will be ok for mongo too
[15:01:25] <Derick> no
[15:01:31] <Derick> let the driver maintain the balancing
[15:01:44] <Derick> haproxy knows nothing about what is the primary
[15:02:11] <Derick> and hence can't route write requests to the correct node of your replicaset
[15:03:10] <Mmike> But there is nasty timeout wait if one of my mongodb servers goes off line
[15:03:23] <Mmike> and haproxy detects that and pulls that node out of the pool
[15:03:38] <Derick> Mmike: with the PHP driver?
[15:03:46] <Mmike> if I let driver do the balancing it's always connecting to mongo-1, then mongo-2, then mongo-3
[15:03:47] <Mmike> yes
[15:04:01] <Derick> Mmike: luckily for you we've almost a release ready version to fix that
[15:04:17] <Derick> but, haproxy can not work, because it might redirect a write-query (insert) to a secondary
[15:04:20] <Derick> and that will never work
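
A minimal sketch of what the driver relies on to discover the primary (and why it, unlike haproxy, can route writes correctly); run it against any member, hostnames here are placeholders taken from the discussion:

    db.isMaster()
    // example output, trimmed:
    // { "ismaster" : false, "secondary" : true, "primary" : "mongo-1:27017",
    //   "hosts" : [ "mongo-1:27017", "mongo-2:27017", "mongo-3:27017" ] }
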
[15:05:51] <jarrod> i really prefer mongos for even just replication with a write master
[15:05:54] <Mmike> hm
[15:05:57] <jarrod> plus you are setup if you ever want to do sharding
[15:06:08] <Mmike> Derick, so, but, wait
[15:06:14] <Derick> wait? :)
[15:06:25] <Mmike> i'm trying to wrap my head around this :)
[15:06:36] <Derick> ok :)
[15:07:00] <Mmike> let me try something
[15:07:09] <NodeX> are your clusters replicated?
[15:07:25] <NodeX> or just the web servers replicated
[15:09:50] <Gargoyle> Migrating from Ubuntu 12.04LTS package for 2.0.4 to 10gen packages for 2.0.7 seems to have been a flawless upgrade. (touch wood!)
[15:10:31] <NodeX> :D
[15:10:58] <Mmike> i have 3 mongodb servers in replication
[15:11:24] <Mmike> now I didn't realize that there is one master and that client library knows that so no matter to which one I connect from my php code somehow i'll always end up on the master
[15:11:50] <Mmike> now, i just set up haproxy in front of those (one frontend IP that I connect to from php, and 3 backends, pointing to three mongodb servers)
[15:11:57] <Mmike> i can read/write via the haproxy IP with no issues
[15:12:21] <Mmike> but If I block with firewall access to just one mongodb server from the php box, I can't connect to mongodbs via the haproxy ip
[15:12:32] <Mmike> and now I'd like to shoot my self in the head
[15:13:18] <NodeX> :/
[15:13:31] <NodeX> the mongos should always handle the routing as the other guys said
[15:13:51] <NodeX> just because haproxy thinks something is under load doesn't mean that the mongo on that server is
[15:13:57] <NodeX> only mongos knows that information
[15:14:45] <jarrod> its so easy to setup
[15:14:46] <jarrod> just do it!
[15:14:51] <jarrod> mongos is awesome
[15:15:05] <jarrod> plus, when i do my read queries, i just do slaveok=True
[15:15:06] <jarrod> and bam bam
[15:15:31] <Mmike> i need to have HA setup with mongos, is that possible?
[15:15:39] <jarrod> yes
[15:15:42] <Mmike> what if my mongos server goes down?
[15:15:47] <jarrod> ha is perfect-o
[15:16:41] <Gargoyle> I thought you only needed to bother with mongos if you were doing sharding?
[15:16:46] <jarrod> no
[15:16:53] <jarrod> it also works for having master and replication
[15:16:59] <jarrod> but sets you up to do sharding down the road
[15:17:00] <Derick> Mmike: from the 1.3 release, the PHP driver will support multiple mongos connections
[15:17:03] <jarrod> i just dont need sharding
[15:17:19] <Mmike> Derick, good to know :/ but I need it somehow today :)
[15:17:46] <Derick> you can use the master branch on github :-)
[15:17:54] <Gargoyle> jarrod: OK. But except for the easy path to sharding, it's just the same as having a norm replset config?
[15:17:56] <Mmike> jarrod, so, my three mongos, one is always master, right? And client library directs writes to the master, no matter to what instance I connect to?
[15:18:09] <jarrod> well with mongos
[15:18:20] <jarrod> you send to it and it will know the elected master
[15:19:13] <Gargoyle> jarrod: It knows that anyway.
[15:19:20] <jarrod> thats what i said
[15:19:28] <jarrod> but i dont rely on the client library to handle that
[15:19:30] <jarrod> i let mongos do it
[15:19:33] <Gargoyle> I mean, the driver does.
[15:19:36] <Gargoyle> Ahh ok.
[15:19:40] <Derick> Mmike: yes
[15:19:43] <jarrod> i dont want to specify every server in my client
[15:19:44] <Derick> it discovers the master
[15:19:52] <Derick> jarrod: you do need to specify more than one though
[15:20:04] <Derick> incase the one you have in your connection string goes down f.e.
[15:20:49] <Mmike> so, what happens if my master goes down? Other boxes get auto-promoted to be masters?
[15:20:57] <Gargoyle> I have the full 3 IP setup in the software connection string.
[15:21:11] <Derick> Mmike: yes
[15:21:12] <Gargoyle> Mmike: Elections! :) But yeah
[15:21:38] <Mmike> ok, so then there is that timeout issue if I'm relying on a client library
[15:21:53] <Mmike> Ok, will try mongos
[15:21:58] <Mmike> and ditch haproxyes
[15:22:15] <Mmike> I guess I can have mongos loadbalanced somehow? Because that's what I do with my haproxies
[15:22:43] <Mmike> Although it would be really nice that once I connect to mongoB my client library is only connecting to mongoB, and not to all the boxes in the replication setup :)
[15:22:50] <Gargoyle> mongos is very lightweight isn't it?
[15:22:53] <Derick> yes
[15:23:20] <Gargoyle> Might look at running mongos on the webservers and connecting to localhost. :)
[15:23:34] <jarrod> gargoyle++
[15:23:57] <jarrod> mongos is so lightweight
[15:23:58] <jarrod> get creative
[15:24:15] <Gargoyle> creative ehhh!?!?
[15:24:34] <jarrod> if its jailbroken, go for it
[15:25:23] <Mmike> jarrod, is mongos a service/daemon i connect to, or some additional thing I need to put on the client side?
[15:25:45] <Derick> it's a daemon
[15:25:51] <Derick> that you install on the webservers
[15:25:56] <Derick> (on each of them)
[15:28:41] <Mmike> I see
[15:28:47] <Mmike> thank you, lads, a lot :)
[15:28:52] <Vile1> Guys, can you suggest any best practices for virtualising mongoDB?
[15:29:10] <Mmike> I guess there is no harm with haproxy, but no benefit either
[15:29:48] <Mmike> because client lib has no clue about haproxy and just connects to mongod, via haproxy. but then it connects to other mongods, so haproxy is really not needed there
[15:30:15] <Vile1> At the moment I'm planning to have instances with up to 500GB HDD, 4GB RAM, 1 vCPU
[15:30:41] <NodeX> why
[15:30:49] <NodeX> mongo loves RAM, feed it ;)
[15:30:54] <Gargoyle> Vile1: You'll probably want more RAM.
[15:31:06] <Vile1> 1 mongod/instance. + separate mongos/arbiters
[15:31:17] <kali> RAM is about the only thing that matters
[15:31:24] <Vile1> NodeX: since 2.0.7 it does not seem to love it that much
[15:32:26] <Vile1> it uses just a tiny piece of what it was using before
[15:33:51] <NodeX> I dont understand why you would want to limit the scalability of something
[15:34:26] <NodeX> or add headaches at scale time by having to look after more machines because you only wanted a very small amount in each instance
[15:34:31] <Gargoyle> NodeX: To prevent Skynet from becoming self aware!
[15:34:37] <NodeX> well there is that!
[15:34:42] <Vile1> cause ultimately i want a lot of small chunks flowing between my physical nodes
[15:34:50] <Gargoyle> :P
[15:35:05] <Vile1> not some large bulk things that will not fit into the server
[15:35:25] <NodeX> your app will crawl to a halt
[15:35:31] <NodeX> ^^ if you use those specs
[15:36:40] <Vile1> NodeX: what would be better from your point of view - 4 machines with mongod (2 shards) 4gb ram each or one machine with 16 gb ram?
[15:37:10] <Vile1> (and i do a lot of data processing)
[15:38:24] <NodeX> in my opinion 16gb machine because it will process the data faster
[15:38:28] <NodeX> and is less to manage
[15:40:04] <Vile1> NodeX: why would it? mongo is still single-threaded
[15:40:25] <jgornick> Hey guys, is it possible to setup an instance of mongo that can be treated as a sandboxed instance of another instance which is the production data?
[15:40:41] <Vile1> I've run into the situation where there still ram available, but mongo eats up 100% CPU most of the time
[15:41:34] <NodeX> Vile1 : do what you feel is best, you ask for input and pick holes in it, so do what suits your app ;)
[15:41:40] <jgornick> I'm essentially looking to perform inserts and updates against a sandboxed version of our production data to test the commands before performing them on the production instance.
[15:42:28] <Derick> Vile1: mongod is not single threaded
[15:42:40] <Vile1> NodeX: I got your point very well. In fact i am uncomfortable with 4 GB myself also
[15:43:11] <NodeX> jgornick : just change the DB name ;)
[15:43:13] <Vile1> But I think that i'll better make more VMs (easy to add => clone)
[15:43:32] <jgornick> NodeX: What do you mean?
[15:44:42] <Vile1> jgornick: make a clone of production VM and try it. Then do on production
[15:45:04] <NodeX> ^^
[15:45:39] <jgornick> Vile1 and NodeX, however I need this sandbox to be always up to date with the current data in production.
[15:46:35] <NodeX> Mongod : 157gb of virtual memory LOLOLOLOL
[15:46:53] <NodeX> jgornick : replia set ;)
[15:47:12] <NodeX> replica *
[15:47:27] <Vile1> jgornick: ALWAYS?
[15:47:47] <Vile1> or just when you are going to try out something?
[15:47:55] <Gargoyle> jgornick: Are you wanting to sandbox the mongo install (for upgrade testing) or the database (for app testing)?
[15:49:11] <jgornick> I'm trying to have a sandboxed instance of my production database so I can emulate transactions. I would run my updates in the sandboxed database (which is synced with the production database), and if everything works, I will perform the same commits on the production database.
[15:49:46] <Vile1> jgornick: i.e. this is normal behaviour for your app?
[15:49:46] <jgornick> This way, if something happens on the sandboxed database, it doesn't affect production data.
[15:49:56] <jgornick> Vile1: This is something we need to implement.
[15:50:06] <NodeX> Use a Replica set
[15:50:08] <jgornick> We need to find some way to support transactions for mongo.
[15:50:11] <Gargoyle> jgornick: setup a second server and then look into mongodump and mongorestore
[15:50:16] <NodeX> ^^ Replica set
[15:50:19] <Vile1> each transaction is run against sandbox, then "somehow" analyzed and then applied to main?
[15:50:40] <jgornick> Vile1: In the application layer, we would apply the same commits again.
[15:50:46] <Gargoyle> jgornick: Sounds like the wrong tool for the job.
[15:50:56] <jgornick> Gargoyle: That's what I'm getting too.
[15:51:02] <Vile1> if so - then you can just run two copies of the DB
[15:51:19] <Vile1> independently
[15:51:44] <jgornick> Mongo doesn't want transaction support right? Meaning, they won't add it in the future?
[15:51:53] <Vile1> and if something breaks - then you just restore sandbox from the main
[15:52:04] <Gargoyle> jgornick: Although, you are probably thinking in a relational db way. You could probably rethink the whole idea of "transactions". This might also help:- http://cookbook.mongodb.org/patterns/perform-two-phase-commits/
[15:52:12] <NodeX> if you need and I mean really need transactions use something with transaction support
[15:52:15] <jgornick> Gargoyle: Yeah, read that over many times ;)
[15:52:55] <Vile1> with mongo
[15:52:59] <jgornick> Vile1: Please explain :)
[15:53:09] <Vile1> "intent log" of a kind
[15:53:33] <Vile1> every meaningful change goes into log collection first, then applied to DB
[15:53:43] <Vile1> if necessary - rolled back
[15:54:01] <jgornick> Vile1: You wouldn't happen to have any code for this that I could look over?
[15:54:11] <jgornick> That's a similar approach we were going to look at.
[15:54:31] <NodeX> you dont even need to implement a log and roll back, you can nest every transaction inside the document itself
[15:54:34] <Vile1> jgornick: that is not a complete solution.
[15:54:51] <Vile1> it does not have rollback for deletes :)
[15:54:51] <NodeX> you can roll to any point in time for any document then without affecting other docs
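
A minimal sketch of the "intent log" idea Vile1 describes, in the spirit of the two-phase commit pattern linked earlier (collection names, fields and the operation are hypothetical):

    // 1. record the intended change first
    var entry = { _id: ObjectId(), op: "debit", account: "A", amount: 100, state: "pending" }
    db.txlog.insert(entry)
    // 2. apply it to the real data
    db.accounts.update({ _id: "A" }, { $inc: { balance: -100 } })
    // 3. mark the log entry done; anything still "pending" after a crash is a
    //    candidate for rollback or re-apply
    db.txlog.update({ _id: entry._id }, { $set: { state: "done" } })
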
[15:55:42] <jgornick> Vile1: Still, you have anything I could look at :)
[15:56:19] <Vile1> jgornick: Sorry. NDA
[15:57:15] <jgornick> Vile1: :(
[15:57:40] <Vile1> jgornick: but you can really run two DBs side by side
[15:58:06] <NodeX> ok 3rd and final time
[15:58:09] <NodeX> RUN A REPLICA SET
[15:58:10] <jgornick> lol
[15:58:10] <Vile1> run query on first => verify consistency => reset DB to the previous state
[15:58:26] <Vile1> NodeX: will not help here
[15:58:33] <Gargoyle> NodeX: That won't help!
[15:58:38] <Vile1> I understand what he wants
[15:58:43] <Gargoyle> You can't change the replica slaves!
[15:58:46] <Vile1> or other option:
[15:58:49] <NodeX> no of course it wont help ;p;
[15:58:50] <jgornick> NodeX: I'm seeing it and have the replicata set docs open just to make sure, but I don't think it'll work for what I'm trying to do :)
[15:58:50] <NodeX> LOL
[15:58:59] <Vile1> 1. run DB on ZFS.
[15:59:10] <NodeX> you asked for a server with UPTODATE information on it, (which is what a replica set is)_
[15:59:13] <Vile1> 2. make ZFS snapshot before transaction
[15:59:43] <Vile1> 3. if transaction fails: stop mongo, revert snapshot, start mongo
[15:59:53] <jgornick> Vile1: That is some dangerous stuff there :P
[16:00:00] <Derick> ugh
[16:00:03] <jgornick> I'll just chalk this one up to wrong tool for the job.
[16:00:05] <Gargoyle> jgornick: Or admit you are outside what is sensible use case for mongo and use another db, or a hybrid solution.
[16:00:06] <NodeX> *shudder*
[16:00:19] <Derick> use a transactional database, or make sure all your updates are done to one document only
[16:00:21] <Vile1> %)
[16:00:22] <NodeX> [16:51:28] <NodeX> if you need and I mean really need transactions use something with transaction support
[16:01:00] <jgornick> Allrighty, thanks for talking this through guys.
[16:01:05] <jgornick> I appreciate it!
[16:03:12] <NodeX> apart from the danger of it and other transactions being blocked while things are being restarted?
[16:03:15] <Gargoyle> Vile1: Because if there is more than one client, any kind of stop, restore snapshot, start system will not work.
[16:04:42] <Gargoyle> I've just written code that when 1 doc is changed, hundreds of others in the same and different collections also need to be changed.
[16:04:46] <Vile1> Gargoyle: obviously there MUST be single client. or rather "transactional mongo server" application
[16:05:17] <Vile1> so, all the other client must connect to it
[16:05:30] <Gargoyle> But the end result is something that can be figured out after the fact, so if something goes wrong, I can have a cleanup script that can work out what it was supposed to be.
[16:05:35] <Vile1> otherwise some of the clients will have no transaction support
[16:06:47] <Vile1> or you can call that "transactions with database-level locking"
[16:08:27] <Vile1> I'm using this kind of approach when applying patches to VMs
[16:08:50] <Vile1> snapshot-apply-check => revert or delete snapshot
[16:09:27] <NodeX> the only way to truly transact is to lift all of the documents from the collection(s), put them in a transactional DB, do the work and re-join them back
[16:10:42] <Gargoyle> NodeX: And what if something fails while you are "putting back" and only half get put back.
[16:10:59] <NodeX> then the document doesnt get Unlocked?
[16:11:04] <Vile1> NodeX: so all that's left is to write storage driver for mongodb based on transactional database?
[16:11:05] <NodeX> that's implied
[16:11:54] <Vile1> that's actually doable as well
[16:11:57] <NodeX> It's a hack, and I would never do it but it would work
[16:11:58] <Gargoyle> Anyway. It is dinner time, so I need to find my baseball bat so that I can make some scrambled eggs!
[16:12:14] <Vile1> Gargoyle: 11 past midnight for me
[16:12:32] <Vile1> but you are right about dinner time :)
[16:15:02] <alanhollis> Hi guys, I'm after a little help if anyones about and bored?
[16:15:35] <alanhollis> I'm using map reduce on a collection on 2 difference collections and combining them into one
[16:15:56] <alanhollis> I then do another map reduce on this reduced collection and inline the data out
[16:16:28] <alanhollis> unfortunately unless the first permanent collection is completely unique (i.e created with a timestamp)
[16:16:47] <alanhollis> inline seems to be doubling up the results of the reduce
[16:51:25] <jarrod> is it possible to do a bulk update
[16:51:34] <jarrod> with a list of documents that already have _ids
[16:56:46] <LesTR> hi, is it normal to have in one rs 2 secondary servers and no primary? : )
[17:14:36] <darq> hi. I have a collection of documents that look like this {'name':'John', 'surname':'Smith'} ? this works db.names.find({'name':{'$regex': 'some-regex', '$options':'i'}}) How can i combine name and surname?
[17:20:10] <monosym> hello, does anyone know of a good way to check for matches between two mongodb collections? the matches would be in the form of a database object string
[17:20:31] <monosym> I've found some people online suggest mapreduce, but is there another possible way to do it?
[17:21:02] <kali> how big are the two collections ?
[17:21:52] <monosym> one is around 70k items
[17:21:57] <monosym> the other is under 15k
[17:26:45] <jsmonkey123> Hi, If I have a collection called Accounts and there I got a bunch of account hashes with nested hashes as well. Accounts contains the property "Sites" which has an array full of objects/hashes. I want to remove a hash from the array that is nested in Accounts.Sites. How do I best do this?
[17:34:26] <jarrod> jsmonkey123: pop
[17:34:53] <jarrod> {$pop: { 'Accounts.Sites': hash } }
[17:34:56] <jarrod> ?
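
Note that $pop only removes the first or last element of an array; removing a specific embedded document is usually done with $pull. A minimal sketch, with hypothetical field names and match criterion:

    // pull the embedded document whose url matches out of the Sites array
    db.Accounts.update(
        { _id: ObjectId("YOUR_ID") },
        { $pull: { Sites: { url: "http://example.com" } } }
    )
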
[17:34:58] <crudson> monosym: map reduce is probably your best bet I reckon with that many docs. there are other ways, like querying the larger for values in the smaller but you'll do a lot of queries (even chopping it into small $in queries)
[17:35:52] <jarrod> how do I do a bulk update with a list of docs that have existing _ids?
[17:37:35] <monosym> crudson could i do something like this? db.documents1.find("key" : db.documents2.find({}, {"search_key": 1}));
[17:37:58] <monosym> @crudson would that possibly work?
[17:38:53] <adamcom> jarrod: db.collection.update({"_id" : {$in : [list of IDs]}}, false, true)
[17:38:55] <adamcom> http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24in
[17:39:15] <crudson> monosym: no you can't do that I'm afraid
[17:39:48] <monosym> crudson: why wouldnt that work? just curious
[17:40:09] <jarrod> adamcom: I have queried for many docs, updated each of them individually, and want to send that list back so that each is updated with its own data
[17:40:15] <jarrod> rather than saving each one individually
[17:40:21] <jarrod> i do not believe $in accomplishes that
[17:41:35] <adamcom> jarrod: so it's not the same update for each document?
[17:42:18] <jarrod> no, they have each been modified
[17:42:36] <jarrod> with different values
[17:42:52] <jarrod> so it is something like a bulk insert, except each one should be updated rather than created
[17:43:15] <adamcom> then……how would you expect to pass in what to update for each individual element?
[17:43:39] <adamcom> and actually, I forgot the update part to that query above, but I think the idea was clear enough
[17:43:48] <jarrod> i assume each document (that has its id) would have the key: value that has been changed
[17:44:02] <jarrod> mongo could see that the _id already exists, and update the doc
[17:44:31] <jarrod> i guess that could also be accomplished with a batch
[17:44:35] <adamcom> but you want what that update is to be different for each one? or the same, like a an increment or similar
[17:44:47] <monosym> crudson: i think it makes sense now that I would need to go with mapreduce, thanks for the advice
[17:45:09] <jarrod> basically i take my accounting data from my enterprise software, and reconcile all the changes every night
[17:45:15] <crudson> monosym: I can create an example if you like
[17:45:22] <jarrod> sometimes different things change for different records, and instead of updating each one individually (which works)
[17:45:31] <jarrod> i would rather submit all the changes with a single update
[17:46:25] <jarrod> sorry if i seem stubborn :)
[17:47:49] <adamcom> I don't think it's possible, if it doesn't fit into the update() paradigm - you can't pass in a list of operations as a batch in a single update (which is what I think you are looking for)
[17:48:02] <jarrod> ok, sorry
[17:48:07] <jarrod> do batch jobs exist?
[17:48:10] <monosym> crudson: oh yeah, an example would be totally helpful
[17:48:10] <jarrod> so i can pass in many updates?
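As adamcom says, there is no single update call for per-document changes like this; a minimal sketch of the per-document loop jarrod describes as working (the ids, collection and field names are hypothetical):

    var changes = [
        { _id: ObjectId("5041aaaaaaaaaaaaaaaaaaaa"), balance: 100 },
        { _id: ObjectId("5041bbbbbbbbbbbbbbbbbbbb"), balance: 250 }
    ]
    // apply each document's own values; one round trip per document
    changes.forEach(function (c) {
        db.accounts.update({ _id: c._id }, { $set: { balance: c.balance } })
    })
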
[18:22:05] <monosym> crudson: is it cool if i take you up on that offer to produce a mapreduce example? can't really find any online particular to my situation.
[18:25:26] <crudson> monosym: sorry got a mtg starting but if you're around in a couple hrs I was in the middle of making an example
[18:28:17] <monosym> crudson: I won't be around in a couple of hours but would you be so kind as to send it in an email? m2544314@gmail.com
[18:44:33] <Gizmo_x> Hi Guys, i wanna know if i can use c# mongodb driver in asp.net mvc 4 web app hosted in shared hosting, will there be a problem with driver because db is on remote server?
[18:51:15] <Gizmo_x> Hi Guys, i wanna know if i can use c# mongodb driver in asp.net mvc 4 web app hosted in shared hosting, will there be a problem with driver because db is on remote server?
[18:58:33] <Gizmo_x> can we include c# driver in ap.net website and then upload all files for web site including driver on sharred hosting to use mongodb remote db?
[19:42:01] <arkban> design question: where you do put your ensureIndex() statements? do you bundle them in your app code (like in the initialization stuff) or do you use a separate script that's part of the install/deploy process?
[19:45:45] <crudson> arkban: for me, it's whenever the collection gets created or initially populated (if the population process would be hindered by the index). It could well be inside application code though, if using map reduce results for example. Note that there is no real 'harm' in doing ensureIndex if it already exists.
[19:46:48] <arkban> so you tie it to creation as opposed to first access by the app? that is when your app starts up, ensure all the indexes (which might do nothing) and proceed
[19:47:18] <arkban> i'm in a situation where the collection may already exist and i'm wondering what the common practices is
[19:53:08] <crudson> crudson: databases can be used by many applications, so is it an application's responsibility to ensure that queries it needs exist? if you say yes, then perhaps ensure they are there when the app boots. There can only be harm in doing that if you have a whopping collection and a big index, the creation of which will take some time.
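
A minimal sketch of the ensure-on-boot approach crudson describes (collection and keys are hypothetical); ensureIndex is a no-op when the index already exists, so it is safe to run on every start:

    // build in the background so a large existing collection is not blocked
    db.events.ensureIndex({ userId: 1, createdAt: -1 }, { background: true })
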
[20:05:30] <joshua> Hmm so there is no init script included with the mongodb rpms, just for mongod?
[20:26:56] <jgornick> Hey guys, does the aggregation framework in 2.2 allow you to aggregate over multiple collections?
[20:54:30] <crudson> jgornick: no don't think you can do that
[20:54:48] <jgornick> crudson: :(
[20:55:07] <jgornick> So the only way to do multiple collection aggregation is multiple map/reduce jobs?
[20:59:00] <crudson> jgornick: use {out:{reduce:'collection_name'}}
[20:59:33] <jgornick> crudson: Sounds good.
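
A minimal sketch of crudson's out:{reduce:...} suggestion applied to monosym's two-collection match (collection and field names are assumptions):

    // count each key per side, merging the second pass into the first output collection
    var map_a = function () { emit(this.search_key, { a: 1, b: 0 }); };
    var map_b = function () { emit(this.search_key, { a: 0, b: 1 }); };
    var red = function (key, vals) {
        var r = { a: 0, b: 0 };
        vals.forEach(function (v) { r.a += v.a; r.b += v.b; });
        return r;
    };
    db.big.mapReduce(map_a, red, { out: { replace: "matches" } });
    db.small.mapReduce(map_b, red, { out: { reduce: "matches" } });
    // keys present in both collections have counts on both sides
    db.matches.find({ "value.a": { $gt: 0 }, "value.b": { $gt: 0 } });
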
[21:03:54] <camonz> hi, is there a way to update a set of records like db.elec.p.update({}, {name: cne_name.pq.split(" ").slice(1).join("_")}, false) ?
[21:09:56] <crudson> camonz: https://jira.mongodb.org/browse/SERVER-458
[21:10:37] <crudson> It will happen, just not yet.
[21:10:47] <camonz> crudson: thanks :)
[21:13:22] <crudson> camonz: if you need fields that are computed like that, create them when you create/update the document. To update a batch, do db.elec.find().forEach(function(doc){ doc.name=....; db.elec.save(doc) }) - non-atomic. Wrap in db.eval to block.
[21:15:27] <camonz> crudson: cool, thanks
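
A minimal sketch of crudson's forEach batch update for camonz's case, assuming the source field is doc.cne_name (not atomic, as noted):

    db.elec.find().forEach(function (doc) {
        // derive the name field from the existing value, as in camonz's expression
        doc.name = doc.cne_name.split(" ").slice(1).join("_");
        db.elec.save(doc);
    });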