[07:28:06] <joannac> luaboy: that doesn't answer my question
[07:28:30] <joannac> luaboy: why do you need to run splitVector?
[09:09:09] <f31n> hi, is it possible to exclude one element from a collection? for example I want to fetch the whole user collection without the user name
[09:43:40] <luaboy> joannac: I need to process the data with a number of workers
[09:44:06] <luaboy> so split it into several vectors
[09:49:32] <joannac> luaboy: just do it from config.chunks and split up your work that way
[09:58:34] <bogn> f31n: {username: {$ne: 'someName'}} is your query, see http://docs.mongodb.org/manual/reference/operator/query/ne/#op._S_ne
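bogn's $ne answer excludes matching documents; if f31n instead meant omitting the field itself, a projection does that. A minimal sketch, assuming a "users" collection and a "username" field:

    // exclude documents whose username equals a given value
    db.users.find({username: {$ne: 'someName'}})
    // or omit the username field from every returned document instead
    db.users.find({}, {username: 0})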
[10:01:07] <luaboy> joannac:where is config.chunks ?
[10:02:55] <luaboy> joannac: I have inserted a lot of data into the db, then ran runCommand({'splitVector': 'test.wc', 'maxChunkSize': 1330, 'keyPattern': {'_id': 1}, 'force': 0}), which gives { "splitKeys" : [ ], "ok" : 1 }
[10:03:31] <luaboy> why the result splitKeys is empty?
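A rough sketch of joannac's config.chunks suggestion, run against a mongos (the namespace is taken from luaboy's splitVector call). Note that splitVector returns an empty splitKeys array when the data fits within a single chunk of maxChunkSize, which may be what is happening here:

    // list the existing chunk ranges for test.wc and hand each to a worker
    var conf = db.getSiblingDB("config");
    conf.chunks.find({ns: "test.wc"}).forEach(function(c) {
        print(tojson(c.min) + " -> " + tojson(c.max));  // one worker per range
    });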
[10:10:05] <bogn> you're welcome, see the other query and update parameters there as well
[11:27:00] <arussel> when using mongodb, I've got this error: "/usr/bin/mongodump: Argument list too long"
[11:27:10] <arussel> does this come from the shell or from mongodump?
[11:41:25] <arussel> how can you pass a long query document to mongodump considering most Linux systems have a limit of 128 kB? is there a way to pass a file or something else?
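The 128 kB ceiling is the kernel's ARG_MAX, not a mongodump limit. A quick check, plus the file-based flag that later tool releases added (3.2+, if memory serves; not yet available at the time of this log, and the db/collection names are placeholders):

    # inspect the kernel's argument-size limit
    getconf ARG_MAX
    # newer mongodump versions can read the query from a file instead:
    mongodump --db mydb --collection records --queryFile query.json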
[11:52:44] <crised_> Looking for a HA multi zone hosted mongodb in amazon aws, I don't need much resources... 2 GB ram, and 20 GB SSD will be sufficient. Should I go for compose.io or mongolab?
[12:27:45] <nawadanp> Hi. Is there any way to show the chunk migration queue on a mongos (2.6.10) ?
[12:46:54] <Hanumaan> what does this error mean "database error: unauthorized db:titpit lock type:-1 client:10.10.45.45" ?
[12:47:25] <Hanumaan> is this because some port is blocked or access to database is not there?
[13:08:20] <luaboy> IcePic: what is the relationship or diff between 'distributed' and 'parallel computing'?
[13:44:52] <nfo> Hi. Can I set a different value for "replIndexPrefetch" on each replica ? I'd like to keep indexes on secondary servers which take queries, and not keep them on an hidden replica only responsible for backup/snapshots.
[13:53:07] <nfo> (Also known as "replication.secondaryIndexPrefetch" in the new documentation format: http://docs.mongodb.org/manual/reference/configuration-options/#replication.secondaryIndexPrefetch)
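Since secondaryIndexPrefetch is a per-mongod option, each member can indeed carry its own value. A sketch of nfo's split, one snippet per member's config file (values as nfo describes):

    # on the secondaries that take queries
    replication:
       secondaryIndexPrefetch: all

    # on the hidden member used only for backups/snapshots
    replication:
       secondaryIndexPrefetch: none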
[13:53:19] <Petazz> Hi! I'm looking for best practices on increasing my mongo server storage size without any downtime but can't really find a tutorial for it. Is there one?
[13:54:36] <cheeser> add a replica set member, step down after sync, swap out the hardware.
[14:15:29] <Petazz> What would be a best practice to keep the service running while scaling up the disk size?
[14:34:48] <cheeser> unless you're using LVM, adding disk space will almost certainly mean taking that machine down.
[14:43:44] <doge__> Is try.mongodb.org moved somewhere?
[14:43:54] <Hanumaan> trying to connect from a remote machine to mongodb and it gives the following error: "assertion" : "unauthorized db:admin lock type:-1 client:x"
[15:00:51] <arussel> Petazz: are you using a replica set ?
[15:00:56] <Petazz> cheeser: Even with a replica set?
[15:03:12] <cheeser> if you're on AWS, just spin up a new replica set member, let the sync finish (maybe...) and spin down the old node and terminate it.
[15:03:35] <Petazz> cheeser: I get your point but I think it is very important and relative, since changing the underlying HW usually requires operations on the SW
[15:04:00] <arussel> Petazz: take down the node, create a snapshot, create a volume from the snapshot with increased size, bring the node back up, attach it back to the set
[15:04:10] <cheeser> mongodb can only stay up so long as the OS allows for such things.
[15:04:13] <Petazz> So I could avoid downtime by forcing a new master before switching?
[15:06:25] <Petazz> It said on the replica set page that automatic failover requires 10 seconds of silence from the primary, but doing a stepDown or the like should not result in that?
[15:11:59] <Petazz> Ok cool thanks a lot guys :) I was surprised that I couldn't find any blog posts on what is a best practice on how to manage increasing datasets in terms of disk space
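A rough outline of the rolling approach cheeser and arussel describe, resizing one member at a time, secondaries first (commands are illustrative):

    # 1. cleanly stop the member being resized
    mongo admin --eval 'db.shutdownServer()'
    # 2. grow its disk (e.g. snapshot -> larger volume, per arussel),
    #    restart mongod, and let it catch up from the oplog
    # 3. for the primary, step it down first so a resized secondary
    #    takes over, then repeat steps 1-2 on it
    mongo admin --eval 'rs.stepDown(120)'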
[15:23:41] <fxmulder> so I have added a new replica set and set up sharding, and it migrated data to the new replicas. The old replicas didn't go down in disk space usage, which I expected, but usage seems to be creeping up still, and looking at the data files it looks like it has been creating new 2GB collection files along the way
[15:26:21] <fxmulder> I'm currently at 88% usage so I won't last too much longer before running out of space
[15:34:14] <Hanumaan> connection is getting accepted but unable to see the data. comes up with this error "database error: unauthorized db:titpit lock type:-1 client:10.10.45.45" any solution ..
[15:42:46] <MeStesso> does anyone have experience backing up MongoDB databases running in a Docker container? (with an attached data-only container for data)
[15:43:29] <MeStesso> I'm looking for some advice. some people suggest using mongodump, others to go into the data container and tar all the files (can I do that while the server is running?)
[16:00:57] <GothAlice> MeStesso: Of note, unless absolutely required due to data size issues, I avoid filesystem snapshots as a method of "backup". They're hard to do right, and add too many edge cases for automation when recovering for me; I need my servers to repair themselves as simply as possible.
[16:03:44] <MeStesso> GothAlice: so you’d recommend using mongodump then?
[16:04:01] <MeStesso> GothAlice: replication is not really a backup; we’ll certainly do that, but for now money isn’t allowing it ;)
[16:04:51] <GothAlice> If I correctly surmised you have a standalone, then yeah, mongodump is the tool to use. It's important to note that when not operating under a replica set mongodump _can not_ produce point-in-time snapshots, as there is no oplog to store alongside the backup data.
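The flags GothAlice is alluding to, usable only against a replica set member: --oplog also dumps the operations that occur while the dump runs, and --oplogReplay applies them on restore for a consistent point in time (paths are placeholders):

    mongodump --oplog --out /backups/today
    mongorestore --oplogReplay /backups/today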
[16:05:11] <GothAlice> MeStesso: Also, replication certainly is a backup.
[16:05:42] <GothAlice> I run two replicas outside the datacenter in our office, one live, one 24-hour delayed. Additionally, MMS is effectively acting as another offsite replica.
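A hypothetical sketch of the 24-hour-delayed member GothAlice describes, added from the primary's shell (host and _id are placeholders):

    // priority 0 + hidden keep it out of elections and client reads;
    // slaveDelay (seconds) holds it 24 hours behind the primary
    rs.add({_id: 3, host: "office-backup.example.com:27017",
            priority: 0, hidden: true, slaveDelay: 86400})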
[16:07:39] <crised_> Is it a good idea to "expose" completely to the internet a mongodb server?
[16:07:43] <GothAlice> MMS gives me a web-based interface to snapshot or point-in-time restores, the in-house replicas let me efficiently run queries locally while also protecting against the datacenter going away. If the DC goes away, I spin up two nodes on any host in the world, and bam, our database is fully operational again. My production database servers don't even have permanent storage: they're RAMdisk.
[16:09:12] <StephenLynx> only let a minimal set of IPs connect to it.
[16:09:21] <GothAlice> crised_: It's possible to get real SSL certificates for free. SSL certs are all, by definition, signed. (Some are just "self-signed".)
[16:09:30] <GothAlice> StephenLynx: That's a false sense of security.
[16:10:28] <GothAlice> I run a dedicated VLAN with IPSec.
[16:11:11] <StephenLynx> hm, my kung-fu has a long way to go regarding sysadminning.
[16:11:13] <GothAlice> Any host that needs DB access is added to the VLAN on a third network interface. (eth0 = public, eth1 = datacenter-local, but shared amongst VM customers, eth2 = MongoDB private VLAN.)
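Alongside the VLAN, mongod itself can be told to listen only on the private interface; a minimal config sketch, assuming 10.10.0.5 is the eth2 address:

    net:
      bindIp: 127.0.0.1,10.10.0.5   # loopback plus the MongoDB VLAN only
      port: 27017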
[16:11:43] <arussel> Is there a way to pass to mongodump a --query from a file ?
[16:11:57] <StephenLynx> I remember my boss telling me something about IPs that are not actual valid IPs.
[16:12:10] <GothAlice> arussel: The standard UNIX tool "xargs" may help you there. "man xargs" to discover its awesomeness.
[16:12:22] <GothAlice> StephenLynx: Certain IP ranges are allocated for special use.
[16:13:09] <arussel> GothAlice: not sure how that could help, I'm trying to get around the argv limit
[16:13:42] <GothAlice> Then it's a single argument.
[16:14:30] <arussel> GothAlice: the limit is on the size, not the number
[16:14:33] <GothAlice> And luckily modern GCC-compiled software isn't limited to 64K of arguments, either. (It adds a stub to software to process more.)
[16:15:11] <arussel> my query is huge as I'm passing 3000 UUID
[16:15:16] <GothAlice> If your query is hitting modern GNU Linux/GCC argument size limits then there's something seriously wrong with a) your query, and b) your data that requires such a query.
[16:15:40] <GothAlice> Might I suggest adding a field to those records which pre-computes a simple boolean to indicate inclusion in that dump?
[16:15:49] <arussel> my query is not wrong, I'm just filtering document based on their ids.
[16:16:08] <GothAlice> That's pretty much the single worst way to search for large quantities of records. :P
[16:16:29] <GothAlice> You're supplying a query… you're searching.
[16:17:59] <GothAlice> So, the question comes down to: what was the intent of selecting for export that way? A simple boolean field and query for everything with that field set to true would be infinitely simpler, and completely avoid the arglist issues.
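A sketch of GothAlice's marker-field idea; "records", "export", and ids.js are hypothetical names. Loaded inside the shell, the 3000-UUID list never touches the OS argument limit:

    load("ids.js")  // defines var ids = [ ... 3000 UUIDs ... ]
    db.records.update({_id: {$in: ids}}, {$set: {export: true}}, {multi: true})
    // then dump by the flag:
    //   mongodump --db mydb --collection records --query '{"export": true}'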
[16:21:47] <MeStesso> good suggestion to keep a replica (offline) on-premises
[16:21:53] <deathanchor> GothAlice: yeah, I have a dev suggesting I do $in for updates to "batch" them, which changes 100K updates down to 1000 updates if each one has 100 items in the $in.
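What deathanchor's dev is proposing, sketched with hypothetical names: chunk the id list and issue one multi-update per chunk:

    var batch = 100;
    for (var i = 0; i < ids.length; i += batch) {
        db.items.update(
            {_id: {$in: ids.slice(i, i + batch)}},  // 100 ids per update
            {$set: {processed: true}},
            {multi: true}
        );
    }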
[16:22:14] <GothAlice> But how are you even getting that list in the first place?
[18:11:01] <redsand> question, using the php driver (1.6.x) how can I use the aggregate command against a collection AND ALSO request partial results?
[18:11:17] <redsand> I can only specify this via MongoCursor object with the function call ->partial(true)
[18:11:25] <arussel> GothAlice: we have data analysts who have read access to some of our collections. They want to be able to do some computation on their own machines, so they compute what would be a good representation of the overall data and extract it => they have a list of document ids they want to dump. Then they copy it to their local machine and do whatever they want with it.
[18:13:14] <arussel> GothAlice: I could add on each document a list of analysts that would want the document and recompute it each time the analyst wants a new set, but being able to pass the list of ids in a file is much easier
[18:18:00] <deathanchor> any way to add a hint to a runCommand?
[18:21:35] <crazydip> Can't find an official repo for the newest Debian stable (8 / Jessie). Can someone help?
[18:43:17] <crised_> I love the idea of DynamoDB, but somehow it doesn't feel as popular as mongodb. Why is that? What does DynamoDB lack?
[18:52:14] <crazydip> crised_: MongoDB and DynamoDB are trying to solve different problems - they're two different types of databases. DynamoDB is a KV store, so it's more like Membase or Riak, not mongo
[18:55:22] <crazydip> although for a database as "popular" as MongoDB I'm shocked they don't have an official repo for the newest Debian stable
[18:56:13] <crised_> crazydip: DynamoDB supports both document and key-value store models.
[19:04:48] <brotatochip> hey guys, anybody available to help me figure out the best way to set up a DR instance of MongoDB? I've attempted to set up a 2 member, master-slave replica set according to the MongoDB 2.6 documentation but I'm seeing an error.
[19:04:59] <brotatochip> here's a pastebin of my config, status, and log: http://pastebin.com/p7T4zZ5F
[19:47:48] <brotatochip> cheeser: essentially I need a failover for my production standalone instance, but not necessarily a 3 member replica set
[19:48:30] <GothAlice> crazydip: Wait, people still use binary packages? ¬_¬
[19:48:38] <brotatochip> if need be I could set up an arbiter although I'd prefer to not have to spin up a 3rd VM unless it's necessary
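The arbiter route brotatochip mentions: with only two data-bearing members, losing either one leaves no voting majority, so a third, dataless vote is the usual fix. From the primary's shell (host is a placeholder):

    rs.addArb("arbiter.example.com:27017")  // votes in elections, stores no data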
[20:17:02] <crazydip> GothAlice: wait, people still compile from source every time they need something? that's so 1990, been there done that :)
[20:18:21] <GothAlice> crazydip: Considering a kernel compiles in ~45 seconds for me, and the result boots faster, uses less RAM, and is more finely tuned for the hardware, it's somewhat less "1990". Takes longer for Ubuntu to download the binary kernel and install it. ;)
[20:19:25] <joeyjones> Using Amazon Linux on AWS I don't think that really applies
[20:20:02] <GothAlice> Hmm, that's a good question. I don't have AWS handy, but if you spin up a brand new Amazon Linux install, what's the process list and memory stats just after boot?
[20:22:56] <crazydip> GothAlice: but in all seriousness, any idea on Debian 8 official mongodb packages and why it's taking so long (or where they are)?
[20:38:25] <GothAlice> crazydip: You might find something usable in http://dl.mongodb.org/dl/linux/
[20:38:32] <GothAlice> These are community builds.
[20:41:42] <crazydip> GothAlice: thanks! unfortunately all the debian ones are built against old Debian 7.1... Debian 8 had major changes, including that it now uses systemd instead of sysv, so using Debian 7 to even re-build the package for Debian 8 won't work reliably
[20:42:12] <GothAlice> crazydip: Since you use Debian, how do I get "lsof" installed?
[20:43:32] <crazydip> GothAlice: like any other package: apt-get install lsof
[20:43:44] <GothAlice> Yeah, tried that first, didn't work on Debian 8.
[20:43:58] <GothAlice> "Unable to locate package."
[20:45:30] <crazydip> GothAlice: your sources are screwed up then, it's there, it should have been installed by default... I just checked on my fresh bare Debian 8 netinstall (like, totally bare, only a few hundred MB for the whole distro) and lsof was installed
[20:45:41] <GothAlice> I'm using a stock Rackspace image.
[20:46:30] <crazydip> GothAlice: sounds like those "Rackspace" people did funny things to their image then
[20:47:15] <crazydip> maybe they're forcing you to use their repo instead of debians by default?
[20:47:34] <crazydip> and some nutjob forgot to port all the packages over
[20:48:07] <GothAlice> https://gist.github.com/amcgregor/bfbc3dabdcae369eb389 is the comparison between Debian 8, Ubuntu 15.04, and Gentoo, without Debian lsof results. (Just in the RAM stats, Ubuntu consumes 61 MB of real RAM, Debian consumes 49 MB of real RAM, and Gentoo uses 39 MB. Gentoo is also ~3-4 seconds faster on kernel boot, though Debian is still faster there, due to a more optimized kernel.)
[20:49:10] <GothAlice> (All on the same class of VM.)
[20:50:58] <brotatochip> GothAlice: why would anyone run Gentoo without an optimized kernel? That's like buying a Ferrari with an automatic transmission
[20:51:15] <GothAlice> brotatochip: The default kernel uses an autodetection process and compiles 99% of everything as modules.
[20:51:26] <crazydip> yes yes, I used to be a Gentoo ricer too! started in 2002 no less, when there was practically zero documentation and definitely none of the helper stuff that's available now on Gentoo... Drobbins was helping me out on IRC hah! those were the days!
[20:51:27] <GothAlice> It's the ultimate level of compatibility, but slow as a dog on startup.
[20:51:45] <brotatochip> That's badass. Gotta love Gentoo.
[20:52:12] <GothAlice> This comparison is an "out of the box" one, i.e. click two buttons to spin up a VM, this is what you get when you first SSH in. :)
[20:52:16] <brotatochip> Wow crazydip, and I thought the first time I built gentoo in 2006 I had it rough
[20:58:58] <crazydip> I'm making a mirror of Gentoo is Rice because in 10 years' time I'm going to reminisce about 2004 & Gentoo :D
[20:59:25] <GothAlice> crazydip: Automating servers also huge fun, with gems like the following to find all init.d scripts for all services affected by a git pull: git diff --name-only @{1}.. | xargs qfile -Cq | xargs --no-run-if-empty equery files | grep init.d/ | sort -u
[21:01:58] <crazydip> from Gentoo is Rice: "Why in the hell are you using software compiled for a 386 on a Pentium 4 class machine?"
[21:02:45] <GothAlice> Well, notably things like vector instructions, crypto instructions, and the various SSE levels are things i386-compiled software won't benefit from, that *do* have significant impacts on performance.
[21:03:27] <GothAlice> But all of my machines are 64-bit pure these days, anyway, so slightly less legacy to keep around.
[21:04:01] <crazydip> "Watching shit scroll by for hours makes me a Linux expert overnight!"
[21:04:45] <crazydip> ^ but in truth, Gentoo was my first real distro, without any docs or helper scripts it really made me understand the workings of Linux... really well...
[21:05:22] <crazydip> and took weeks to get up and running too! :D
[21:06:23] <GothAlice> First thing I turn off. Running parallel --jobs inherently hides the individual command output in favour of nice progress output. :) And yeah, if one wants to learn one might as well dive into the deep end. Now that stage3 is the preferred method new installations can take < 10 minutes, for servers. Times have changed a bit. ^_^; (stage1 was insane, using GCC to recompile GCC to recompile GCC…)
[21:08:52] <crazydip> stage1 was the only stage when I was a ricer... all without docs! but Drobbins was there to help out, guide us first-timers, and from our "shitty" experience write the docs :D
[21:12:12] <crazydip> GothAlice: any idea who's responsible for package building for MongoDB?
[21:12:32] <GothAlice> Alas, no, though I'd open a ticket on JIRA if I were you. :)
[21:13:01] <crazydip> damn, another service I need to sign up for
[21:52:27] <brotatochip> so is anyone able to help me figure out what's wrong with my setup or recommend a better option for disaster recovery?
[21:52:49] <brotatochip> here's my config, log, and status http://pastebin.com/p7T4zZ5F
[22:36:18] <viritt> well, the mode in which I can open mongo through the terminal or Java and operate on it, like seeing what items are in a specific collection, for example using db.collection.find()
[22:36:54] <joannac> starting a replica set still allows you to do that?
[23:29:04] <windsurf_> I need to store values to represent one of, 'none, light, medium, heavy'. Is there a big disadvantage to storing those as strings vs 0 to 3 ints?
[23:41:05] <viritt> joannac: I tried to follow that tutorial but I am not sure how to set up a replica set and the primary DB on the same machine
[23:42:26] <viritt> it's strange: if I run with the default config file specified (mongod --config /etc/mongo.conf), it looks like mongo opens a different database than when running with no config
[23:42:49] <viritt> and I don't know how to set up the replica