#mongodb logs for Wednesday the 11th of December, 2013

[00:02:48] <joannac> in the config file it's documented as true/false
[00:02:58] <joannac> Why are you running with quiet anyway? Not production?
[00:03:26] <justin___> Because I'm pushing logs to Papertrail, and all these connection events are way too noisy
[00:04:35] <joannac> okay
[00:06:27] <joannac> Not possible to process them before they go to Papertrail?
[00:06:40] <joannac> If you ever have connection specific problems, it'll be a lot harder to debug
[02:03:11] <dgaffney> Hello. Has Anyone Really Been Far Even as do MapReduce? I am totally adrift in getting my sea legs with my first MR script and can't figure out where things went south.
[02:03:41] <dgaffney> https://gist.github.com/DGaffney/4ef629b5a3a01abc1c57 < this fellow
[02:22:50] <joannac> dgaffney: Have you gone through http://docs.mongodb.org/manual/tutorial/troubleshoot-map-function/ and http://docs.mongodb.org/manual/tutorial/troubleshoot-reduce-function/
[02:23:09] <dgaffney> as best as I could with this example...
[02:30:50] <joannac> Well, the map part should be okay
[02:30:54] <dgaffney> What seems to be happening is that the documents in the final collection are identical to the output the map should produce
[02:30:59] <joannac> Is it working as you expect?
[02:31:24] <dgaffney> which leads me to believe that there's something goofy going on in the reduce *but* I ran the reduce with an example and got the output I was expecting...
[02:31:30] <dgaffney> and that's seeming to look really grand...
[02:31:42] <dgaffney> but it doesn't actually get stored in the collection.
[02:32:16] <joannac> Wait, you get the right output, but it's not getting written?
[02:32:48] <dgaffney> when I instantiate some vars in the repl and run my reduce function with those vars the output of the reduce function looks correct.
[02:33:18] <dgaffney> When I run the map reduce process proper, however, the result looks exactly like the documents that should be generated by the map function *not* the reduce function
[02:33:45] <dgaffney> it's almost as if reduce were effectively behaving as return {key: key, value: value} and doing nothing else..
[02:34:10] <dgaffney> All of this tickles that gut feeling that there's just some silly element missing that I'm not aware of...
[02:34:15] <joannac> hmm
[02:34:20] <dgaffney> that said, the other_networks object is *massive*
[02:34:25] <dgaffney> maybe that should be mentioned
[02:35:03] <dgaffney> (although it's looking like it gets through the document size limit, obviously, as they get written out in the collection)
[02:37:12] <joannac> I don't understand how that could be, given your map returns a document, and your reduce returns a scores array
[02:37:49] <dgaffney> L59 is the massive object I'm referencing here - it's a set of 683 hashes, each containing anywhere from ≈0-2000 object IDs and a couple of strings.
[02:38:02] <dgaffney> Maybe I need to shift my thinking here entirely?
[02:39:00] <dgaffney> The goal is relatively simple (maybe there's a better way for it?) - given two ids and two associated sets of arrays, generate a score that is the intersection length of those two arrays divided by the union length.
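For reference: MongoDB never calls reduce for a key that map emitted only once, and reduce must return a value with the same shape map emits, since it can be re-invoked on its own output. Either point would produce exactly the symptom above, where the stored documents look like raw map output. A minimal mongo-shell sketch, with hypothetical field names and arrays assumed free of duplicates:

    var map = function () {
        // emit one value per pair key (fields are placeholders)
        emit(this.pair_id, { ids: this.other_network_ids });
    };
    var reduce = function (key, values) {
        // fold every emitted value for this key into one, keeping the
        // same shape as the emitted value so re-reduce stays valid
        var merged = [];
        values.forEach(function (v) { merged = merged.concat(v.ids); });
        return { ids: merged };
    };
    // Jaccard-style score: |A ∩ B| / |A ∪ B|
    var score = function (a, b) {
        var inA = {};
        a.forEach(function (x) { inA[x] = true; });
        var inter = b.filter(function (x) { return inA[x]; }).length;
        return inter / (a.length + b.length - inter);
    };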
[04:37:41] <nathanielc> Anyone familiar with the internals of mongo? I am trying to figure out how splits work. I have found the split command and the applyOps queries it uses to create new chunk docs in the config.chunks collection, but I haven't found where the data is actually split.
[04:37:57] <nathanielc> What is the association between chunks and the actual data on disk?
[04:38:35] <joannac> It's just metadata
[04:39:27] <nathanielc> So the data exists on the mongod instance, and when a mongos instance needs to find the data it just looks up which shard has it?
[04:39:46] <joannac> Correct, in the config server
[04:39:51] <nathanielc> So the data itself isn't split until a move chunk operation later
[04:39:57] <joannac> correct
[04:40:01] <nathanielc> ok makes sense
[04:42:06] <nathanielc> Thanks, can you clarify one more thing for me? The distinction between "chunk version" and "shard version" in the context of the chunk metadata. Also, a little background: I am working on this jira ticket https://jira.mongodb.org/browse/SERVER-924
[04:44:46] <nathanielc> Basically I am linking two collections and then removing the metadata for one of the collections, so they can share it. But since the original versions of the two sets of chunks are different it's giving me some trouble
[05:53:36] <chaotic_good> hi I really need some help on a problem I can't crack
[05:54:03] <chaotic_good> I take all the contents of the /data/mongo dir on production, and move them to 3 staging nodes
[05:54:28] <chaotic_good> I want staging to run as a different replica set
[05:54:36] <chaotic_good> so I delete the local.* files
[05:54:45] <chaotic_good> I run rs.initiate()
[05:54:59] <chaotic_good> and it seems it should all work right?
[05:55:08] <chaotic_good> but I had some bizarre problems last week
[05:55:16] <chaotic_good> one time mongo even deleted all the data
[05:55:38] <chaotic_good> what is the normal process for seeding a dev environment from a prod cluster? 3 nodes?
[05:57:04] <chaotic_good> http://docs.mongodb.org/manual/faq/developers/#how-do-you-copy-all-objects-from-one-collection-to-another I am reading here as well
[05:57:11] <mongo-rooky> heh
[06:00:51] <mongo-rooky> anyone around?
[06:00:58] <ranman> for you? always
[06:01:16] <ranman> what's up?
[06:02:42] <mongo-rooky> I want to take a 3 node mongodb with just 1 replicaset, and make a second basic 3 node replicaset cluster, but with a different replicaset name
[06:02:58] <mongo-rooky> I am having trouble basically restoring when I rsync the files
[06:03:20] <mongo-rooky> If I delete the local* files
[06:03:26] <mongo-rooky> then I can run rs.initiate()
[06:03:39] <mongo-rooky> and I can get my new replicaset, and then add nodes 2 and 3
[06:03:46] <mongo-rooky> this is all on the new cluster
[06:04:12] <mongo-rooky> the production data is about 195G
[06:04:19] <mongo-rooky> so it's slow going each time I fail
[06:04:40] <mongo-rooky> and if I only move the files to node 1 and let it sync, mongodb stops
[06:04:41] <mongo-rooky> :(
[06:05:13] <mongo-rooky> http://docs.mongodb.org/manual/reference/method/db.cloneCollection/#db.cloneCollection I wonder if this is the way??
[06:05:32] <mongo-rooky> I mean I want to keep the prod data, about 120 small databases for various customers
[06:05:51] <mongo-rooky> but use a new replicaset, cuz I don't want the old one to get run over
[06:10:14] <mongo-rooky> hello?
[06:10:19] <mongo-rooky> scare you off?
[06:13:06] <ranman> hey
[06:13:07] <ranman> no
[06:13:09] <ranman> hold on
[06:13:11] <ranman> let me read
[06:13:50] <ranman> so you're trying to maintain 2 separate replicaset clusters?
[06:36:03] <mongo-rooky> just trying to create new cluster, with same data, but new replicaset name
[06:36:12] <mongo-rooky> I mean this seems basic to me
[06:36:20] <mongo-rooky> but I followed docs and it failed
[06:36:44] <mongo-rooky> I mean are the docs bad on purpose to push enterprise?
[06:36:46] <mongo-rooky> lol
[06:36:50] <mongo-rooky> and make u buy support
[06:36:53] <mongo-rooky> jeeeesh
[06:37:17] <ranman> so I just tried to reproduce that with a 2GB database and had no issues :/
[06:37:41] <ranman> to clarify further
[06:38:00] <ranman> you're not trying to make these replicasets talk to each other right?
[06:38:24] <ranman> you're just moving one repl set to a new set of machines?
[06:39:13] <ranman> and could you clarify why rsync isn't working?
[06:39:52] <ranman> also I'm afk for the next hour or so but maybe someone else can pickup your question while I'm gone
[06:39:54] <ranman> GL
[06:40:01] <ranman> if not I'll help when I get back
[06:41:01] <joannac> mongo-rooky: I can help but it's really not clear to me what you've tried
[06:44:46] <mongo-rooky> I rsynced the files from node 1 cluster 1
[06:44:57] <mongo-rooky> to node 1/2/3 cluster 2
[06:45:03] <mongo-rooky> 195G worth
[06:45:06] <mongo-rooky> now
[06:45:10] <mongo-rooky> what is the correct way
[06:45:16] <mongo-rooky> to keep the data
[06:45:31] <mongo-rooky> but erase the old replicaset and initiate a new one?
[06:45:56] <mongo-rooky> there are 120 files, each for a different client
[06:46:07] <mongo-rooky> ranging from 64M to 2G in size
[06:46:39] <mongo-rooky> basically I want to dump prod to staging
[06:46:40] <mongo-rooky> now?
[06:46:42] <mongo-rooky> how?
[06:47:00] <mongo-rooky> (sorry didnt mean to write now, mistype)
[06:50:11] <mongo-rooky> anyone awake?
[06:50:13] <mongo-rooky> :)
[06:50:30] <joannac> mongo-rooky: people here are volunteers of their time. Please be patient
[06:51:05] <mongo-rooky> :) sorry
[06:52:15] <joannac> Remove the local.* files from each db directory
[06:52:18] <joannac> start the mongods
[06:52:23] <joannac> run rs.conf()
[06:52:29] <joannac> no, rs.initiate()
[06:52:41] <joannac> on ONE (and only one) of the mongods
[06:53:24] <mongo-rooky> ok
[06:53:48] <mongo-rooky> then rs.add("hostname:port") right for 2 and 3?
[06:53:52] <joannac> right
[06:54:01] <joannac> from the same mongo shell
[06:54:03] <mongo-rooky> once that's done should the data remain? from the prod box?
[06:54:07] <mongo-rooky> yes
[06:54:10] <joannac> rs.add("host2:port2") etc
[06:54:11] <joannac> yes
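joannac's steps, gathered into one sketch (set name, hostnames, and dbpath are placeholders):

    // with every mongod stopped, on each node:
    //     rm /data/mongo/local.*      -- drops the old replica set identity
    //     mongod --dbpath /data/mongo --replSet stagingRS
    // then, in a mongo shell connected to node 1 ONLY:
    rs.initiate()
    rs.add("staging2.example.com:27017")
    rs.add("staging3.example.com:27017")
    rs.status()    // members should reach PRIMARY/SECONDARY; the data remains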
[06:54:16] <mongo-rooky> ok cool
[06:54:34] <mongo-rooky> maybe I did something odd, because the data got all dropped, I will try again
[06:54:35] <mongo-rooky> thx
[06:54:58] <mongo-rooky> now I need to start node 2 and 3 to add them....
[06:55:08] <mongo-rooky> that's ok after node 1 is initiated right?
[06:55:19] <joannac> yeah
[06:55:21] <mongo-rooky> k
[06:55:46] <mongo-rooky> well jeesh I guess I'm a dodo then and did some quirky thing to make it drop the dbs
[06:55:51] <mongo-rooky> heh
[06:56:04] <mongo-rooky> have you done a refresh like that 100 times
[06:56:05] <mongo-rooky> ?
[06:56:17] <joannac> not really
[06:56:52] <mongo-rooky> lol
[06:57:13] <mongo-rooky> the devs are kinda handing me mongo and I usually handle linux
[06:57:28] <mongo-rooky> this is a real basic setup but helps the prod website a lot
[06:57:59] <mongo-rooky> what's next after I get it going? should I add indexes and stuff?
[06:58:29] <joannac> erm, you didn't already have indexes?
[06:58:42] <mongo-rooky> I am not sure maybe the devs have some in there.
[06:58:55] <joannac> if so, yeah, I guess. Don't forget to do a rolling index build and not lock up the secondaries
[06:58:57] <mongo-rooky> they use it as a rucksack it seems
[06:59:11] <joannac> Pick indexes that are useful for your query patterns
[06:59:12] <joannac> etc
[06:59:21] <mongo-rooky> how do I know which ones are?
[06:59:42] <joannac> ask the devs?
[07:00:12] <mongo-rooky> they ask ME
[07:00:13] <mongo-rooky> lol
[07:00:25] <mongo-rooky> I am like here's the linux box and newest stable mongo, have fun
[07:00:29] <mongo-rooky> they are like not so fast bud
[07:06:09] <mongo-rooky> :)
[07:06:15] <mongo-rooky> thanks for the help
[07:06:26] <mongo-rooky> I will be reminded why this failed before
[07:06:34] <mongo-rooky> or be surprised that it all just works after all
[07:13:15] <mongo-rooky> yay!!
[07:13:16] <mongo-rooky> :)
[07:13:21] <mongo-rooky> merry christmas!!
[07:13:30] <joannac> To you too :)
[07:14:00] <joannac> You need to figure out how the collections are used
[07:14:11] <joannac> what kind of queries you run
[08:21:57] <bin> guys i have a problem
[08:22:05] <bin> when i start an arbiter
[08:22:15] <bin> it cannot connect to the rs
[08:22:19] <bin> in the log actually
[08:23:28] <bin> Wed Dec 11 08:20:28.510 [initandlisten] connection accepted from 185.217.159.143:48259 #27 (2 connections now open)
[08:23:28] <bin> Wed Dec 11 08:20:32.320 [rsStart] trying to contact 185.217.159.143:27017
[08:23:28] <bin> Wed Dec 11 08:20:32.324 [rsStart] trying to contact 187.161.123.150:27017
[08:23:28] <bin> Wed Dec 11 08:20:32.324 [rsStart] replSet info Couldn't load config yet. Sleeping 20sec and will try again.
[08:42:16] <phrearch> hello
[08:42:54] <phrearch> i was wondering if querying for a certain value for user.profile.myfield is as efficient as user.myfield
[08:45:02] <Nodex> why would it not be?
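It would not be any different: an index on a dotted path behaves like any other single-field index, so both lookups cost the same once indexed. A sketch with the field names from the question (collection name assumed):

    db.users.ensureIndex({ "profile.myfield": 1 })
    db.users.find({ "profile.myfield": "some value" })  // same index walk as
    db.users.ensureIndex({ myfield: 1 })                // an index on a
    db.users.find({ myfield: "some value" })            // top-level field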
[08:51:36] <cym3try> hi guys, i have hit the connection limit on a mongo instance (1000 something) although i have not specified the maxconns conf. From the docs I read that by default mongo does not limit the connections. So how is this even possible?
[08:55:26] <bin> guys what is the syntax
[08:56:06] <bin> of EXPOSE
[08:56:08] <bin> multiple ports
[08:56:12] <bin> EXPOSE 1,2
[08:56:15] <bin> EXPOSE 1 2
[08:56:22] <bin> EXPOSE [1,2]
[08:57:26] <[AD]Turbo> hello
[09:16:44] <mark____> can anyone explain to me straight up what aggregation in mongodb is??? in one line plz
[09:19:04] <joannac> takes many documents and transforms and/or collates them
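Spelled out a little longer than one line, that transform-and-collate looks like this (collection and field names are hypothetical):

    db.orders.aggregate([
        { $match: { status: "shipped" } },                            // transform: filter
        { $group: { _id: "$customer", total: { $sum: "$amount" } } }  // collate: group + sum
    ])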
[09:32:53] <mark____> @joannac: hey can you teach me the concept? actually I have not been able to get the idea ???
[09:33:42] <spicewiesel> hi all
[09:34:15] <mark____> hi
[09:36:37] <spicewiesel> I had to duplicate a mongodb environment. Both environments are identical, except for IPs. So, I copied the data directory to the 3 mongod nodes and started them. The replicaset itself seems to be fine now: they indexed the data and voted for primary and secondaries. What is the next step? I cannot see the databases in mongos. Do I have to reset mongos and configsrvs to fix that? Would be fine if anyone could help me with that
[09:37:27] <joannac> did you duplicate the mongoses?
[09:37:56] <spicewiesel> nope, I only copied the mongod data directories to the mongod instances
[09:38:02] <joannac> okay
[09:38:18] <spicewiesel> I did nothing to mongos and cfgsrvs until now
[09:38:28] <joannac> http://docs.mongodb.org/manual/reference/method/sh.addShard/ ?
[09:38:34] <joannac> Wait
[09:38:39] <joannac> You copied the data?
[09:38:43] <joannac> What's the goal, here?
[09:38:55] <spicewiesel> we have a test environment
[09:39:15] <spicewiesel> there's data, and this data should now be copied to the other environment, in total
[09:41:04] <spicewiesel> the devs finished the test environment, they set up everything and loaded the data. This data should now be copied to another environment to be used there. So, what I need is a way to duplicate the whole environment
[09:42:04] <joannac> oh
[09:42:48] <joannac> so basically this http://docs.mongodb.org/manual/tutorial/migrate-sharded-cluster-to-new-hardware/
[09:44:21] <bin> i must set
[09:44:27] <bin> arbiter on a port different from 27017
[09:44:28] <bin> right ?
[09:44:42] <spicewiesel> ok, thanks. I try to get it :)
[09:46:03] <bin> right ?
[09:51:41] <spicewiesel> joannac: it seems to be fine now, maybe... :)
[09:52:30] <spicewiesel> replicaset was up, I then deleted the configservers' local db and restarted them. Then I started mongos and added a shard. Now I can see the shard, the databases and I am able to login with user credentials that exist only on the source environment.
[10:05:16] <cym3try> hi guys, i have hit the connection limit on a mongo instance (1000 something) although i have not specified the maxconns conf. From the docs I read that by default mongo does not limit the connections. So how is this even possible?
[10:05:59] <cym3try> i noticed that the shell limit (ulimit -a) is 1024, but the process limit (cat /proc/<mongod-pid>/limits) is actually 12k
[10:20:41] <cym3try> ok... so i found out that the problem is that RHEL and CentOS override the process ulimit
[10:20:49] <cym3try> and since mongo connections are spawned as forks..
[10:20:52] <cym3try> *facepalm*
[10:23:45] <spicewiesel> okay, that's useful information for me, too :)
[11:15:06] <micahf> would it be reasonable to version a mongo database with git?
[11:15:13] <micahf> (or possible?)
[11:37:43] <Lope> how can I do a find that will find anything?
[11:38:09] <Nodex> eh?
[11:38:28] <Lope> well basically I want to do this: db.mycol.findOne();
[11:38:34] <Lope> but I only want to return the _id
[11:38:46] <Lope> so I tried db.mycol.findOne(null,{_id:1});
[11:38:57] <Nodex> db.foo.find({},{_id:1}).limit(1).pretty()
[11:38:59] <Lope> and various other values instead of null, such as true, false etc
[11:39:28] <Nodex> maybe findOne({},{_id:1}) works too - not sure you will have to try
[11:39:43] <Lope> oh shit, I was using the wrong database again
[11:40:26] <Lope> yeah that works
[11:40:41] <Lope> it's actually what I tried originally, I was just using the wrong DB again lol :/
[11:40:47] <Nodex> lol
[11:41:08] <Lope> oh, passing in null also works
[11:41:24] <Lope> db.mycol.findOne(null,{_id: 1});
[11:42:49] <Lope> oh, the manual specifically says pass an empty document {}.
[12:00:21] <Lope> are subscriptions reactive?
[12:00:41] <Lope> ah wrong window
[12:15:37] <spicewiesel> could anyone tell me if there are some userRoles I should use just to create a monitoring user?
[12:16:26] <spicewiesel> e.g. for the nagios monitoring tool named here http://docs.mongodb.org/manual/administration/monitoring/
[13:49:57] <dandre> Hello,
[13:51:34] <dandre> I have a document which has 2 fields of type time (t1 and t2). Is there any way to find all documents matching the t1 > t2 criteria?
[13:52:27] <dandre> I have tried find({"t1":{"$gt":"t2"}})
[13:53:30] <kali> dandre: not efficiently. you can get them with a where clause, or the aggregation framework. both will scan the whole table
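Both options kali names, as a sketch (collection name assumed); neither one can use an index for the t1-versus-t2 comparison, so both scan everything:

    // $where runs JavaScript per document -- simple but slow:
    db.coll.find({ $where: "this.t1 > this.t2" })
    // aggregation framework equivalent, also a full scan:
    db.coll.aggregate([
        { $project: { t1: 1, t2: 1, t1AfterT2: { $gt: ["$t1", "$t2"] } } },
        { $match: { t1AfterT2: true } }
    ])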
[13:54:07] <dandre> ok I'll try where
[13:54:34] <tiller> hi there!
[13:55:49] <tiller> I'm trying to use mongodb's replication, but I've added a new server to my primary and its state is "Down" with: "lastHeartbeatMessage" : "still initializing"
[13:56:29] <tiller> Following the tutorial, I don't have to do anything particular on the mongod I want to add to the set, do I? (except put the replSet to the same as the primary)
[13:56:47] <Derick> correct
[13:57:10] <Derick> can you run rs.status() on the primary (first node in the set)?
[13:57:39] <tiller> yup, http://pastebin.com/SU0GfKfD
[13:57:49] <tiller> oh
[13:57:50] <tiller> what
[13:57:54] <tiller> it's now a secondary
[13:57:58] <tiller> it was a primary before
[13:57:59] <tiller> Oo
[13:58:13] <Derick> that makes sense
[13:58:23] <Derick> because you don't have a majority up, you can't have a primary
[13:58:31] <Derick> your config is wrong
[13:58:41] <Derick> you use "debbie" (likely localhost) and a 10.0.0.250
[13:58:54] <Derick> you always need to use IP addresses that are accessible from all hosts
[13:59:07] <Derick> 10.0.0.250 would not know what "debbie" is
[13:59:17] <dandre> kali: thanks it works fine!
[13:59:40] <tiller> yes you're right. MongoDB chose debbie by itself when I did the initialize() ;o
[14:00:03] <tiller> so, rs.remove & rs.add?
[14:00:15] <kali> dandre: just be aware it's a performance killer
[14:00:19] <Derick> not sure whether you can remove the local machine
[14:00:25] <Derick> tiller: I think you need to do this:
[14:00:31] <Derick> cfg = rs.status()
[14:00:58] <Derick> cfg.members[0].name = "10.0.0.155" (or whatever the IP is)
[14:01:03] <Derick> and then rs.reconfig(cfg)
[14:01:06] <Derick> and hope
[14:01:09] <Derick> this is not production, is it?
[14:01:19] <tiller> no it's not. Don't worry
[14:02:00] <tiller> "replSetReconfig command must be sent to the current replica set primary." =/
[14:03:00] <Derick> tiller: so remove the secondary first
[14:03:08] <dandre> ok
[14:03:14] <tiller> If I add "debbie" to the second server's hosts file, it should also do the trick, right?
[14:03:22] <Derick> yes, but that is a hack
[14:03:25] <Derick> don't do that :P
[14:03:43] <tiller> yes but then I could gain the primary state, and do the update ;o
[14:03:47] <tiller> But I'll do as you say :)
[14:04:48] <tiller> The same happens with rs.remove() : "replSetReconfig command must be sent to the current replica set primary."
[14:05:16] <Derick> hmm
[14:06:29] <tiller> ok, I changed the hosts file. Now I've Primary & Recovering
[14:06:43] <Derick> whoop
[14:06:50] <tiller> I'll wait until the state changes to secondary, and then I'll set the right IP
[14:06:55] <Derick> and when you have that done, make sure you reconfigure with two IPs
[14:07:26] <tiller> (Well, I think it will change from recovering to secondary?)
[14:07:42] <Derick> it should!
[14:07:47] <Derick> it needs to copy over data
[14:07:51] <Derick> (if your primary had data)
[14:07:53] <tiller> oh shi-
[14:07:56] <tiller> I just remembered..
[14:08:03] <tiller> I've a multi-gigabyte collection..
[14:08:09] <Derick> then it will take some time :-)
[14:11:06] <tiller> I might run out of disk space soon ;p
[14:11:12] <tiller> but hey, thanks for the hand!
[14:15:05] <tiller> (Derick> And it was cfg = rs.conf() not .status() :p)
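With that correction folded in, the whole maneuver reads as follows (IP assumed; note that in rs.conf() the member field is host, not name as in rs.status()):

    cfg = rs.conf()                          // not rs.status()
    cfg.members[0].host = "10.0.0.155:27017"
    rs.reconfig(cfg)                         // must be run on the primary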
[14:15:17] <Derick> sorry :)
[14:17:08] <tiller> Can we replicate only specific databases?
[14:17:21] <Derick> nope, replication is everything
[14:18:27] <cheeser> but "local" of course
[14:18:29] <cheeser> :D
[14:18:55] <Derick> yes
[14:19:07] <tiller> oh
[14:19:16] <Derick> tiller: don't listen to cheeser - local is special
[14:19:54] <tiller> meeh. My multi-GB collection is only for a temporary test on something else
[14:21:26] <tiller> Is it a very bad thing I shouldn't do, or just a bad thing I might not do? ;o
[14:21:33] <cheeser> Derick: but i'm right! :)
[14:21:43] <Derick> tiller: very bad thing
[14:21:46] <Derick> cheeser: go tell him
[14:22:02] <cheeser> yeah. don't muck with local
[14:22:09] <tiller> m'okay :(
[14:39:14] <tiller> cp -R /var/lib/mongodb /var/lib/mongodb2
[14:39:48] <tiller> and then set dbpath=/var/lib/mongodb2
[14:40:04] <tiller> Was I too naive? Because mongod doesn't seem to want to start with this config
[14:40:23] <Nodex> permissions?
[14:40:44] <tiller> oh right!
[14:41:14] <tiller> Nodex> Good call
[14:41:46] <Nodex> not often wrong ;)
[15:59:38] <tiller> Is there some kind of "flush" with MongoDB?
[15:59:58] <tiller> I dropped my big collection, and my /var/lib/mongodb folder is still at 17 GB
[16:00:23] <ron> dropped or removed?
[16:00:29] <tiller> drop
[16:05:26] <tg2> Do a repair so it commits it to disk
[16:06:01] <tg2> Repair takes space on its own
[16:06:37] <cheeser> you'll just lock your db while that runs.
[16:07:05] <tiller> hum weird
[16:07:14] <tg2> So if you don't have enough space for that (it will warn you), then you can do a mongodump, remove the db, and then a mongorestore
[16:07:21] <tiller> Should I remove instead of drop the next time?
[16:07:43] <tg2> Remove doesn't free the disk space either
[16:08:03] <tg2> It's quite shotty in this respect
[16:08:29] <cheeser> shoddy ;)
[16:08:44] <tg2> ^
[16:09:02] <tg2> No reason other than laziness that this hasn't been addressed
[16:09:05] <tg2> Lol
[16:09:24] <cheeser> try not to speak on behalf of others
[16:09:27] <tiller> "If you are running as part of a replica set, you should always restore from a backup or restart the mongod instance with an empty dbpath and allow MongoDB to perform an initial sync to restore the data."
[16:09:35] <tiller> The option of the empty dbpath may be better?
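For reference, the options raised in this exchange, as a mongo-shell sketch (collection name hypothetical). On the MMAPv1 storage engine of this era, drop and remove free space for reuse but do not return it to the OS; dropping an entire database does delete its files:

    db.repairDatabase()                   // rewrites and compacts the files;
                                          // locks the db and needs roughly
                                          // data-size worth of free disk
    db.runCommand({ compact: "mycoll" })  // defragments one collection; the
                                          // space is reusable, not released
    db.dropDatabase()                     // removes the database's files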
[16:51:57] <intellix> can I do something like: db.collection.update({ "field": "column" }, { $set: { fieldName: "test" } });
[16:52:12] <intellix> basically, have a variable called fieldName, as I want to do something like "something." + variable
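Yes, by building the update document first; a computed key can't be written inline in a (pre-ES6) object literal. A sketch using the names from the question:

    var fieldName = "test";                          // the variable part
    var setDoc = {};
    setDoc["something." + fieldName] = "some value"; // { "something.test": ... }
    db.collection.update({ field: "column" }, { $set: setDoc });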
[17:03:23] <tkeith> Is searching by string prefix nearly as fast as searching by full string?
[17:03:33] <tkeith> assuming I have an index on the field
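With an index on the field, an anchored case-sensitive prefix regex is answered by a bounded index range scan, so for a selective prefix it is close to an exact match. A sketch (collection and field assumed):

    db.items.ensureIndex({ name: 1 })
    db.items.find({ name: "foobar" })  // exact match: index seek
    db.items.find({ name: /^foo/ })    // anchored prefix: index range scan
    db.items.find({ name: /foo/i })    // unanchored or case-insensitive:
                                       // walks the whole index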
[17:09:20] <micahf> is it possible to version data in mongo with git
[17:09:21] <micahf> ?
[17:13:46] <stashspot> does any know how to connect to mongo if my password has an @ sign in it?
[17:14:03] <ranman> on the commandline?
[17:14:21] <stashspot> using node's native driver
[17:14:23] <stashspot> node.js
[17:14:45] <stashspot> command line works, but most of the drivers expect the connection string in the user:pass@host format
[17:14:55] <stashspot> and because the password has an at sign, it breaks this convention
[17:15:36] <ranman> with nodejs, I'm not sure :/ it should expose the db.authenticate command though
[17:16:47] <stashspot> but i can't even connect because it requires credentials
[17:16:58] <stashspot> i have to db.connect before i can call db.auth no?
[17:17:17] <ranman> no, connection and authentication are two different processes AFAIK
[17:18:26] <ranman> stashspot: example, http://mongodb.github.io/node-mongodb-native/api-generated/db.html#authenticate
[17:20:05] <stashspot> so you're saying i should be able to call db.open, even if auth is required
[17:28:25] <stashspot> ?
[17:28:57] <ranman> stashspot: yes, you won't be able to do anything until after you authenticate though
[17:29:24] <stashspot> any ideas on why i can connect to this remote db using various desktop clients
[17:29:36] <stashspot> but when i try to call db.connect it just hangs and then times out?
[17:29:59] <ranman> can you connect with the example in the docs?
[17:32:16] <stashspot> i cannot
[17:32:22] <stashspot> but if i fire up say mongohub
[17:32:24] <stashspot> it works just fine
[17:32:33] <stashspot> using all of the same connection params
[17:33:00] <ranman> I'm not familiar with javascript -- could the @ symbol need escaping?
[17:33:47] <stashspot> tried to no avail
[17:33:48] <stashspot> %30
[17:33:50] <stashspot> i mean %40
[17:33:56] <stashspot> { [MongoError: auth fails] name: 'MongoError', errmsg: 'auth fails', ok: 0 }
[17:34:08] <stashspot> thats what i get if i leave off user:pass and try to just do db.connect
[17:34:18] <stashspot> it never makes it to db.auth because the connection is never able to be opened
[17:35:23] <tg2> Sniff the packets on the machine that's connecting and look at how it's sending the string
[17:35:31] <tg2> it's likely escaping
[17:35:58] <stashspot> i mean i can't even connect without the user:pass in the conn string
[17:36:08] <stashspot> in node, I'm supposed to call db.open and then call db.auth within the callback
[17:36:14] <stashspot> but db.open complains about not being auth'd
[17:36:28] <stashspot> so how the F am i supposed to call db.auth if i can't db.connect because i need to auth
[17:36:31] <stashspot> drives me insane
[17:36:31] <ranman> stashspot you may have to connect to only the admin DB to auth
[17:37:32] <stashspot> @ranman - I've tried and it doesn't work. i also tried not adding a db to the conn string, i tried /test and i tried a db i know exists
[17:37:59] <tg2> http://mathiasbynens.be/notes/javascript-escapes
[17:38:56] <tg2> Stashpot, try connecting with telnet on the port
[17:39:48] <stashspot> i have no issues connecting via commandline
[17:39:54] <stashspot> or other clients
[17:42:17] <ranman> stashspot can you post your code with the password elided in a gist?
[17:42:25] <tg2> From node?
[17:42:53] <tg2> It's probably your node lib if it's only doing it from there
[17:43:06] <stashspot> that's what i think
[17:43:21] <stashspot> I'm chatting with the node room
[17:43:23] <ranman> seems unlikely that the node driver would not support that. There is a JIRA ticket from July 2013 saying this was fixed in all drivers
[17:43:28] <stashspot> let me do some more deep diving
[17:56:57] <stashspot> ladies and gents, there is a .connect option called *drum roll*: uri_decode_auth
[17:57:18] <stashspot> if i uriencode the user and pass, boom - at signs are non-issue
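A sketch of that fix with the node native driver of the time (host, database, and credentials are hypothetical):

    var MongoClient = require('mongodb').MongoClient;
    var user = encodeURIComponent('myuser');
    var pass = encodeURIComponent('p@ssw0rd');   // '@' becomes %40
    var url = 'mongodb://' + user + ':' + pass + '@myhost:27017/mydb';
    MongoClient.connect(url, { uri_decode_auth: true }, function (err, db) {
        if (err) throw err;
        // connected and authenticated; the driver decodes the credentials
        db.close();
    });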
[18:21:12] <stashspot> any ideas on why it would take about 4 mins for db.connect to return?
[18:22:51] <ron> long distance
[18:29:59] <tg2> lol
[19:50:05] <lifechamp> is raid 5 ok for mongodb?
[19:53:00] <platzhirsch> I have 90k documents which I retrieve over an association. I need to get all the values of one attribute, but this takes too long. I have already added an index for this value. Can I still improve this?
[20:00:54] <kali> platzhirsch: can you show us an example?
[20:01:10] <platzhirsch> sure, let's see if I can get the raw query
[20:07:52] <platzhirsch> kali: Seems Moped cannot translate my query to a raw query string. How should I write up the example? Basically the query has a selector, the identifier that distinguishes the documents, and I select only one field, the attribute I need
[20:09:04] <kali> go to the mongodb shell, and reproduce the queries
[20:14:34] <platzhirsch> kali: alright, I assume this is the query: db.metadata_records.find({"snapshot_id": ObjectId("5288e6099207ff2094000008")}, {"score": 1})
[20:15:59] <kali> ok. what index have you created ?
[20:17:27] <platzhirsch> kali: Since snapshot is a nested document of repository I have created: snapshots.metadata_records, and metadata_records.score
[20:19:02] <platzhirsch> ah and "snapshot_id" for metadata_records
[20:19:13] <kali> i don't know anything about "repository", but your metadata_record query will probably get a boost with a composite index: { snapshot_id:1, score: 1}
[20:19:28] <platzhirsch> ah ok, that's a nice idea
[20:19:59] <kali> that is... if you can get moped to pass "_id : false" on the projection field
[20:20:26] <kali> the query has to be: db.metadata_records.find({"snapshot_id": ObjectId("5288e6099207ff2094000008")}, {"score": 1, _id: false})
[20:24:02] <platzhirsch> kali: Why the _id: false? so the default field is omitted?
[20:25:04] <kali> yes. you want a covered query: http://docs.mongodb.org/manual/tutorial/create-indexes-to-support-queries/#create-indexes-that-support-covered-queries
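Putting the pieces of this exchange together: with the compound index kali suggested and _id excluded from the projection, the query is answered from the index alone:

    db.metadata_records.ensureIndex({ snapshot_id: 1, score: 1 })
    db.metadata_records.find(
        { snapshot_id: ObjectId("5288e6099207ff2094000008") },
        { score: 1, _id: 0 }   // _id must be excluded, or the query
    )                          // is no longer covered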
[20:32:14] <platzhirsch> kali: great. I guess Mongoid (Ruby) does that already when I specify .only(:score), but I'll try to verify this. It seems to be quicker already :)
[20:32:42] <platzhirsch> I guess an alternative would be to store the result in snapshot directly? 90k floating point numbers... but I am not so sure whether this is appropriate
[20:38:53] <kali> i'm not sure what we are talking about. i need to see some sample document to understand what we're dealing with :)
[20:40:42] <platzhirsch> I am afraid they are all a bit too big, I think it's okay ^^
[20:49:25] <platzhirsch> darn, _id seems not to be set to false in the query. Wouldn't an alternative be to include the _id in the index?
[20:51:30] <platzhirsch> but more out of interest if this is worse for the performance, I can fix the query somehow ^^
[20:52:30] <platzhirsch> wooosh, a lot faster
[21:29:33] <Sawbones> Has anyone had trouble managing their mongo database for big projects?
[21:30:17] <joannac> Sawbones: Define "big"
[21:30:41] <Sawbones> Equivalent to 15 tables in an SQL database
[21:30:53] <tg2> Define it in size
[21:31:09] <tg2> Records/size on disk
[21:31:29] <Sawbones> I'm not really sure how to define that, maybe like 3gb of data?
[21:31:31] <tg2> Managing = maintenance or scaling?
[21:31:43] <Sawbones> Maintenance
[21:31:49] <tg2> It's pretty simple to manage
[21:31:58] <Sawbones> hmm
[21:31:59] <Sawbones> ok
[21:32:03] <tg2> Just don't run out of space :)
[21:32:29] <tg2> You can host it on mongolabs if you don't want to administer anything
[21:32:37] <tg2> Amazon backed
[21:33:06] <joannac> Sawbones: that's tiny.
[21:33:17] <Sawbones> Another question: say in an SQL db you have two tables with a many-to-many relationship (intermediate table). I know you could set it up the same way in Mongo, but is there a better way to do it since I'm not as restricted?
[21:33:46] <tg2> You have to architect it differently
[21:36:30] <Sawbones> how so?
[21:37:49] <tg2> You can have two collections and each one has the references from the other
[21:38:19] <tg2> So you'd have an object with a subarray of objects from the other "table"
[21:38:29] <Sawbones> Oh i see what you mean
[21:38:34] <tg2> So you don't have to do lookups or joins
[21:38:39] <Sawbones> very cool, I can have just two tables instead of three
[21:38:49] <tg2> You could have one
[21:38:51] <tg2> if you want
[21:39:00] <tg2> Depends on your data
[21:39:04] <Sawbones> I need these to be exclusive
[21:39:14] <tg2> Yeah then keep two collections
[21:39:23] <Sawbones> Thank you
[21:39:27] <tg2> In each you keep the elements of the other
[21:39:38] <tg2> as subarrays
[21:39:47] <tg2> Requires a bit more application logic
[21:39:57] <tg2> But faster usually
[21:40:09] <Sawbones> Would these subarrays be duplicate values or would it reference the other collection?
[21:40:15] <tg2> Riak can do pointers
[21:40:31] <tg2> They would be duplicate copies, think of it like a cache
[21:40:45] <tg2> You are caching the related objects inside the main object
[21:40:46] <Sawbones> hmm so if I updated the other collection it would not update?
[21:41:21] <tg2> Exactly, you'd have to, upon updating the second, check the items in the first and update them accordingly
[21:41:44] <Sawbones> Oh god that sounds terrible
[21:41:46] <tg2> You can also just keep the keys in a subarray and do a second lookup
[21:41:52] <tg2> It's like caching
[21:41:56] <Sawbones> I'll do a second look-up
[21:42:02] <tg2> Except you have more control
[21:42:30] <tg2> You just need to keep the cache updated when its foreign objects are updated
[21:43:14] <Sawbones> With the direction I plan to go I don't really like that since one of those collections will have a few other many-to-many relationships
[21:43:22] <tg2> Look into riak it allows "foreign" keys like pointers that are automatically dereferenced when you load it
[21:43:48] <tg2> Yeah in that case keep an array of object id's from the other collection
[21:44:12] <tg2> Lookup by object id is very fast like sql primary key
[21:44:58] <tg2> Not sure why mongo doesn't support pointers
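A minimal sketch of the pattern tg2 is describing, an array of _ids plus a second lookup (collection and field names hypothetical):

    // each side stores the _ids of its related documents:
    db.students.insert({ _id: 1, name: "Ann", courseIds: [10, 20] })
    db.courses.insert({ _id: 10, title: "Algebra", studentIds: [1] })
    // the "join" is a second query in the application:
    var s = db.students.findOne({ _id: 1 });
    var courses = db.courses.find({ _id: { $in: s.courseIds } }).toArray();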
[21:45:31] <lazypower> Did the MMS service change their requirements for reporting, that it's now an Auth Key and a Secret Key combo?
[21:52:41] <cheeser> pointers?
[21:52:58] <cheeser> you mean foreign keys?
[22:00:22] <tg2> References, pointers, foreign keys
[22:00:26] <tg2> take your pick
[22:06:11] <cheeser> well, they kinda all mean different things.
[22:06:28] <cheeser> but mongo does support FKs/references in the form of DBRef
[22:06:47] <cheeser> they're just not enforced like a FK in pgsql would be
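A DBRef is a convention the shell and drivers understand rather than an enforced constraint; dereferencing is always an explicit second query. A sketch (names hypothetical):

    db.orders.insert({
        product: new DBRef("products", ObjectId("5288e6099207ff2094000001"))
    })
    var o = db.orders.findOne();
    // nothing stops the target document from being deleted out from
    // under the reference:
    db[o.product.$ref].findOne({ _id: o.product.$id });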
[22:56:15] <platzhirsch> If my query looks like this: db.metadata_records.find({"snapshot_id": ObjectId("5288e6099207ff2094000008")}, {"score": 1, "_id": 0}) What's the index to support this? db.metadata_records.ensureIndex({ snapshot_id: 1, score: 1 }) ?