[00:09:23] <sahat> Hi, we have a Mongoose collection with an array field that references another collection. If we remove an item from that array, will it also delete the referenced document from the other collection, the way a SQL foreign key would?
[00:09:34] <sahat> E.g. the Game collection has a comments array field which references the Comment schema. If I delete an item from the comments array, will it also delete that object from the Comments collection?
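A minimal sketch of the setup sahat describes (model names, gameId and commentId are assumed). MongoDB itself has no foreign-key constraints, so pulling an id out of the array only modifies the Game document; the referenced Comment has to be deleted explicitly:

    var mongoose = require('mongoose');

    var gameSchema = new mongoose.Schema({
      title:    String,
      comments: [{ type: mongoose.Schema.Types.ObjectId, ref: 'Comment' }]
    });
    var Game = mongoose.model('Game', gameSchema);

    // Pulling the id only edits the Game document...
    Game.update({ _id: gameId }, { $pull: { comments: commentId } }, function (err) {
      // ...the referenced Comment document still exists; remove it explicitly if needed:
      // Comment.remove({ _id: commentId }, callback);
    });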
[00:41:21] <pottersky> just watched a video on CouchDB apps (CouchApp)... looks like a bit of a stretch to me
[00:41:39] <pottersky> do you know if there's something similar in the MongoDB landscape?
[01:15:43] <thelinuxkid> Shameless HN plug: I created a Python project called Gumjabi that ties the Ping API with Kajabi and it's mentioned in the article. Please upvote. http://news.ycombinator.com/item?id=5033820. Thanks!
[03:35:37] <gjaldon> hi there! i'm following the guide on "getting started with mongodb javascript shell" here: http://docs.mongodb.org/manual/tutorial/getting-started/#create-a-collection-and-insert-documents
[03:36:24] <gjaldon> but I'm getting an error when trying to do the 'Insert multiple docs using a for loop'
[03:36:38] <gjaldon> for (var i = 1; i <= 20; i++) db.things.insert( { x : 4 , j : i } )
[03:37:02] <gjaldon> the error I get: SyntaxError: invalid XML name (shell):1
[03:38:21] <gjaldon> nvm, looks like the spaces matter..lol
[03:48:37] <Guest_1448> is it safe to run repairDatabase() on a "live" database?
[03:59:42] <Kane`> having some troubles with compound indexes. http://codepad.org/yJFtIPJV - the output of .explain() shows they're not being used in the query :/
[04:05:16] <IAD> Kane`: try an {mac, site, metadata.date} or {site, mac, metadata.date} index
[04:08:07] <Kane`> IAD, {mac, site, metadata.date}? not {metadata.mac, metadata.site, metadata.date}?
[04:14:47] <IAD> Kane`: you are right {metadata.mac, metadata.site, metadata.date}
[04:25:31] <Kane`> instead of find({'metadata.date': , 'metadata.site': , 'metadata.mac': })
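A hedged shell sketch of IAD's suggestion (collection name and values are assumed); explain() should then report the compound index being used instead of a scan:

    db.readings.ensureIndex({ "metadata.mac": 1, "metadata.site": 1, "metadata.date": 1 })

    db.readings.find({
      "metadata.mac":  "00:11:22:33:44:55",
      "metadata.site": "site-1",
      "metadata.date": { $gte: ISODate("2013-01-01") }
    }).explain()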
[04:36:04] <Guest_1448> so if I do a lot of .inserts() (with write concern 0) then .count() afterwards, it seems my count() waits until all the inserts are done
[05:54:08] <nemothekid> I was wondering whether there are any performance issues to worry about when dealing with large documents with many indexes, or whether it would be wiser to simply split them into several collections
[05:54:36] <nemothekid> From a mongo perspective, transfer speed isn't important
[06:07:01] <crudson> nemothekid: that's hard to answer without knowing specifics, but normally I'd say unless index sizes are becoming a real problem keep the data logically connected
[06:10:48] <josh-k> hi guys, i need some help fixing an issue with mongodb on Travis CI (i am from the Travis team)
[06:11:21] <josh-k> we are changing to a new VM setup, and since moving to it mongodb won't start and we get this error in the logs https://gist.github.com/0c5491a1f24f83316db6
[06:12:39] <nemothekid> crudson: Thanks, I thought so. The data I have is often queried together and is hardly ever updated, but I'm expecting the collection size to eventually grow to the 10s of millions, and we have other collections like this but they are relatively small. All in all we just don't want to affect query time
[06:14:06] <crudson> josh-k: does /var/lib/mongodb exist with correct permissions? That's ignoring the glaring warnings above regarding 32bit and using OpenVZ.
[06:14:56] <josh-k> /var/lib/mongodb exists, with one file in it: -rwxr-xr-x 1 mongodb nogroup 6 Jan 5 03:37 mongod.lock*
[06:15:57] <crudson> nemothekid: indexes on 10s of millions are fine.
[06:16:04] <crudson> josh-k: did you remove the old lock file?
[06:16:06] <josh-k> it seems like this lock file is old and has been around since the template was created
[06:16:17] <josh-k> we use chef to provision mongodb
[06:16:23] <josh-k> and it should have shut it down gracefully
[06:22:04] <josh-k> crudson: so i am not sure why either the lock file persists or the service is not shutdown gracefully
[06:29:48] <josh-k> crudson: does this chef recipe look correct to you? https://github.com/travis-ci/travis-cookbooks/blob/master/ci_environment/mongodb/recipes/server.rb#L37
[06:30:25] <josh-k> crudson: we just confirmed that stopping the mongo service does not remove the lockfile
[06:31:07] <crudson> josh-k: does the process go away from ps and no longer shown in netstat?
[06:32:44] <josh-k> crudson: not in ps, not in netstat
[06:40:39] <sent-hil> does dropping a table drop the indexes?
[06:59:38] <jrule> loading a simple 1-column file with a header line into a new collection using mongoimport in tsv format. Every row except the first throws an error complaining about a duplicate _id_: insert local.huh keyUpdates:0 exception: E11000 duplicate key error index: local.huh.$_id_ dup key: { : null } code:11000 locks(micros) w:68 0ms. How do I get mongo to generate a new _id for each row loaded?
[07:00:46] <jrule> the file looks like my-id\n1\n2\n3\n etc... the \n are real line feeds, just used for the example here.
[07:01:30] <crudson> jrule: are you using 1) --drop or 2) --upsert?
[07:03:37] <jrule> it was a new empty collection. I get the same results with the --drop and --upsert options
[07:09:09] <jrule> sorry 1 column in the input file, not the table. don't mean to be confusing.
[07:10:18] <jrule> Judging from the log, it looks like mongoimport is generating a null internal _id for each row. The column with the null value is not the column name I am giving it.
[07:16:11] <crudson> can you create a paste of the first 5 lines or so - I am running a db update so my server is locked now, but I can try it when this completes (whenever that will be!)
[07:32:10] <jrule> here is the example: http://pastebin.com/CuCV41Jv
[07:32:56] <jrule> I have the same problem on osx and Centos running this version
[07:37:19] <jrule> you only see the error above when running the server with the --verbose option turned on. Without it, the loads say they all succeeded, but only one record is actually loaded.
[08:10:54] <jrule> crudson... I just fired up an older version of mongo 2.0.6 on amazon and tried the example, works fine
[08:11:31] <jrule> I then upgraded that test instance to the latest 2.2.2 and it does not work. I think I need to open a bug for this... Do you know how to do that? I've never filed a bug.
[09:52:28] <jwilliams_> i execute map reduce js with the command `mongo mydb my.js`, where my.js contains the statement db.collection_name.mapReduce(m,r,{out:"tmp_output", sharded: true}); but mongo throws an exception saying "unknown m/r field for sharding: sharded"
[09:53:03] <jwilliams_> what should i pass in so that mapreduce can dump to a sharded output?
[09:53:29] <jwilliams_> checking the doc (http://docs.mongodb.org/manual/reference/method/db.collection.mapReduce/#db.collection.mapReduce), sharded:true should be the right syntax.
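For what it's worth, the likely fix is that sharded belongs inside the out document together with an output action (replace / merge / reduce), not at the top level of the options; a hedged sketch:

    db.collection_name.mapReduce(m, r, { out: { replace: "tmp_output", sharded: true } })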
[09:59:50] <chrisq> NodeX: tracks and albums dont change, but our rights to play them do. So i need to do a lookup somewhere to find if we're currently allowed to play each track in a playlist
[10:29:09] <Gargoyle> With the aggregation framework, is it possible to get the "top x" in a group? For example, let's say scores grouped by category and there are 20+ scores in each category, can you get it to limit to the top 5 in each category grouping?
[10:31:51] <kali> Gargoyle: yes. group, then sort, then limit
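One hedged way to sketch it in the shell (collection and field names assumed): sort by score first, then group, pushing each category's scores in descending order; on this era of the aggregation framework the final trim to the top 5 per category still has to happen client-side:

    db.scores.aggregate([
      { $sort:  { score: -1 } },
      { $group: { _id: "$category", top: { $push: "$score" } } }
    ])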
[12:30:18] <rawler> NodeX: I've tried with the other index not present..
[12:30:19] <kali> NodeX: that won't change it... it picks the right index, but chooses not to set an upper bound
[12:31:08] <kali> rawler: it's certainly because part of the index is multikey; i had a similar case, but the value that was unbounded was the source of the multiplicity, so it's not exactly the same
[12:33:47] <kali> rawler: so i have an ugly patch to force the optimizer to set the upper bound on this specific index
[12:34:31] <rawler> oh, that's just not gonna fly here.. :S I highly doubt operations will accept that kind of monkey-patching on the DB-level..
[12:35:21] <kali> rawler: yeah, i am "operations" here too, so i at least i can do whatever the fuck i need :)
[12:35:37] <rawler> just for my understanding, is the problem that there are multiple _keys_ in the index, or that 'pricings.parent_id' is kindof an array?
[12:36:35] <kali> rawler: yes, in that case, the index is tagged as "multikey", and the optimizer has to ignore some boundaries so as not to break some requests
[12:37:19] <kali> rawler: it would be better if the metadata info was in the form of "pricings.parent_id is multi" instead of a single boolean for the whole index
[12:37:30] <kali> rawler: but that was too big a patch for me
[12:38:02] <rawler> *ahh* it considers the _entire_ index as multi? that's just wrong.. :)
[12:38:39] <kali> rawler: feel free to open a new jira, link mine for references and give me the id so that i can vote on it ;)
[12:39:48] <kali> rawler: i hate to say this, but i am not happy with the way 10gen has dealt with this issue
[12:40:16] <rawler> why not just reopen the one that is there? it's clearly a serious issue, and it's really unresolved?
[12:42:27] <rawler> the ability to have embedded documents and indexes on them was the whole reason for us to evaluate mongodb for this application.. if the functionality is broken in this way, I guess I'll have to keep fighting with the current MySQL setup.. :S
[12:44:51] <kali> i can't blame you if you don't, but i think it could help me a lot if you wrote that up in a jira
[12:54:15] <HDroid> Will a DBCursor load each document entirely or is it lazily fetched? Trying to iterate over all keys, but at that stage I don't need the documents' contents yet.
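The cursor returns full documents in batches; if only the keys are needed, a projection keeps the document bodies out of the results entirely. A shell sketch of the idea (collection name assumed; the Java driver's find() takes an analogous second "fields" argument):

    db.mycollection.find({}, { _id: 1 }).forEach(function (doc) {
      print(doc._id);
    });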
[13:15:00] <NodeX> var fb='someFunctionName'; fb.call(); <--- works .... var fb='some.dotted.function.name'; fb.call(); <---- fail
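A hedged sketch of one common workaround for the second case: resolve the dotted path to the actual function before calling it (the root object holding the tree is assumed):

    function resolveDotted(root, path) {
      // walk "some.dotted.function.name" one segment at a time
      return path.split('.').reduce(function (obj, key) {
        return obj ? obj[key] : undefined;
      }, root);
    }

    var fn = resolveDotted(this, 'some.dotted.function.name');
    if (typeof fn === 'function') fn();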
[13:15:08] <rawler> oh? about time.. the last year I spent there was mostly a waste of time, and from what I heard the next few years weren't much better..
[13:15:41] <rawler> oh, well.. a bit OT for the channel, I guess.. :)
[13:15:52] <geekie> (I don't know how OT chat is in here, are you in other chans?)
[13:15:57] <NodeX> the trouble with learning to program at uni is you get some other person's idea of how programming should be done
[13:16:33] <geekie> NodeX: Yeah, in this case I taught my teacher some, told him "Ok this is how I wanna do it" and he said "Fine, do it."
[13:16:35] <NodeX> which generally makes for BAD design choices and even worse security/performance
[13:17:39] <rawler> NodeX: well, the main problem at BTH at the time was that they spent an ABSOLUTE minimum on paying teachers.. so they only got people that really would not qualify for a real programming job.. *tm*
[13:18:29] <geekie> rawler: Now the ONLY thing that makes the course is the teacher Mikael Ros; he's dedicated to what he does, he loves to discuss solutions on irc, and he's a great sounding board for my ideas.
[13:19:14] <rawler> oh, I think he was there while I was there too.. I remember his name, but not much more..
[13:51:04] <Gues_____> hey guys, i'm using mongodb in a production environment and my collection is about to hit its 16MB limit. What should I do? I'm using the PHP driver
[15:16:36] <emperorcezar> Recommendations on the best way to serve up files from gridfs out to the internet?
[15:18:53] <kali> emperorcezar: some people use the nginx module
[15:20:10] <emperorcezar> kali: Yea. I was looking at that. Seems weird to have to recompile nginx. I try to avoid that. I'd rather have a package.
[15:20:40] <NodeX> nginx is by far the best way should you not want to talk to a scripting language
[15:20:54] <NodeX> plus recompile takes about 15 seconds lol
[15:21:31] <kali> emperorcezar: yeah, compiling nginx is really ok. it's designed for that
[15:21:50] <NodeX> you can even update it without dropping connections
[15:22:03] <NodeX> "make upgrade" instead of "make install" iirc
[15:27:33] <emperorcezar> Yea, I don't mind compiling, but then it's really harder to keep up with minor releases, security fixes especially.
[15:28:44] <NodeX> there aren't many, if any, for nginx
[15:29:23] <NodeX> even so, it's a small price to pay to have fast file serving
[15:31:01] <marqu3z> Hi guys, do you know of a benchmark page about intersection queries with mongodb? I'm dealing with that and i'm not sure about performance with big collections.
[15:31:52] <NodeX> perhaps explain the query and we can advise if/when it will bottleneck
[15:32:54] <JoeyJoeJo> I'm using pymongo and I have a list of strings to search for in one field. I use MyQuery['field'] = {"$in":MyList} which only works for exact matches. So I changed my list to be python re objects instead of strings, but it returns nothing. Any suggestions on what I can try?
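$in does accept regular-expression values; a shell sketch with assumed collection/field names is below. PyMongo normally encodes re.compile(...) patterns the same way, so an empty result is often down to the patterns being anchored or cased differently than the stored strings:

    db.mycollection.find({ field: { $in: [ /^foo/, /bar/i ] } })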
[15:43:53] <marqu3z> it's not about a specific query. I have all the users logged in with a social network (fb for example) so i store their fb id in mongo. Then another user logs in, and i take his social graph from Facebook, so i have an array with a list of friend ids, and i need to query against that array and find out if there are users in my database that are friends with the new one. I need to make this query very often and don't know how well mongo performs with this
[15:45:21] <marqu3z> i think the "$in" operator is the only one i could use
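For reference, a hedged sketch of the lookup itself (collection and field names assumed): with an index on the stored Facebook id it is a single indexed $in query:

    db.users.ensureIndex({ fbId: 1 })
    db.users.find({ fbId: { $in: friendIds } })   // friendIds: array of ids from the social graph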
[15:45:33] <NodeX> you'll need a graph DB to scale that
[15:45:40] <NodeX> it wont scale well at all in mongo
[15:48:27] <marqu3z> i suspected as much; right now i can't switch to another db. If you know of some benchmark about that, it would be very helpful! Thank you for the help!
[16:57:10] <nonent> question about the shard autobalancer. i've got 64 chunks, and after the initial data import, the 32 chunks with shard key 0-5 are on shard1, while the 32 chunks with shard key 6-10 are on shard2. issue is, higher values are more likely to be accessed. will it rebalance these automatically as use continues? a strategy with 0-2, 6-8 being on shard1 and 3-5, 9-10 being on shard2 would work much better.
[16:58:35] <nonent> i assumed chunks would be added to shards in an alternating fashion, but somehow they've ended up this way, where the first half are all on shard1 and the second half on shard2
[17:00:26] <kali> AFAIK the autobalancer only balances on size, not on traffic
[17:01:04] <kali> that's why it is recommended to use shard keys that give a good, even spread, but of course that's not always possible
[17:01:45] <kali> it might be possible to manually rebalance some chunks to get a few chunks with higher value on the other shard
[17:06:27] <nonent> kali: yeah, i've read those notes, but since most of my queries are based on id for this collection, and inserts are few while updates are many, it seemed like the appropriate choice anyways.... well, except for the tendency for newer ids to be accessed more commonly.
[17:08:45] <nonent> i'll look into possibility of manually rebalancing. i don't know why i assumed distribution of chunks would alternate between shards instead of choosing a pivot point and splitting chunks between that.
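A hedged sketch of a manual move (namespace, shard-key field and shard names are assumed): sh.moveChunk() takes the chunk containing the given key value and sends it to the named shard, and sh.status() shows the resulting distribution:

    sh.moveChunk("mydb.mycollection", { myShardKey: 7 }, "shard0000")
    sh.status()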
[17:12:07] <kali> nonent: well, it makes sense: the first chunk grows, grows, grows, gets split, one of the halves is migrated, and then both sides grow at more or less the same rate and get split, but as the shards are evenly balanced, there's no reason to waste energy migrating data between shards...
[17:14:19] <nonent> i'll see what the access distribution looks like after this is all done before deciding what to do about it. i guess i also have to eval whether the savings i achieve from not needing to query both shards for _id is worth the possible uneven access.
[17:15:11] <nonent> thanks for the help/running me through it mentally.
[17:29:35] <owen1> I start my mongo with /home/t/bin/mongod --fork --logpath /usr/local/nextgen/push/shared/logs/mongodb.log --logappend --config /etc/mongodb.conf my mongodb.conf is: port = 27017 bind_ip = 10.1.72.132 dbpath = /data/db fork = true replSet = push I can't connect to the console to initiate my replica set - "exception: connect failed". looking at the log i see: "couldn't connect to localhost:27017:
[17:29:37] <owen1> couldn't connect to server localhost:27017" and "replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)"
[17:30:49] <owen1> any clues? (btw, it's the primary instance. i plan to have replica set of 2 hosts, for dev only)
[17:33:44] <kali> localhost is 127.0.0.1 and bind_ip is 10.1.72.132
[17:34:30] <owen1> kali: my bind_ip is whatever ifconfig gave me.
[17:34:36] <owen1> kali: should it be something else?
[17:35:01] <owen1> or should i try to connect with mongo --host 10.1.72.132 . let me try
[17:35:26] <kali> owen1: why do you even bother with a bind_ip on a dev machine ? :)
[17:35:48] <owen1> kali: i am not sure what it does. let me read it again
[17:42:47] <bean> you may want to learn more about networking before you worry about it.
[17:43:34] <owen1> bean: ok. i'll omit it for now. thanks
[17:43:59] <bean> cuz i can't really explain if you don't understand the difference between internal networks and external networks
[17:45:00] <owen1> bean: at least i can start googling for something!
[17:47:37] <owen1> i am running rs.initiate() and i never get the cursor back. it's still holding or something. in the log i see: "[rsStart] DBClientCursor::init call() failed"
[17:47:53] <kali> owen1: how long have you waited ?
[17:48:27] <owen1> kali: until i saw the message? about 1 minute
[17:48:51] <kali> ha. you need to look at what mongod logs
[17:50:38] <owen1> kali: this is the log for starting mongo - http://pastebin.ca/2301085 [rsStart] couldn't connect to localhost:27017: couldn't connect to server localhost:27017
[17:50:47] <owen1> and couldn't unlink socket file /tmp/mongodb-27017.sockerrno:1 Operation not permitted skipping
[17:52:55] <owen1> i assume that my issue is somewhere in starting mongo, and not in rs.initiate()
[17:53:40] <kali> is there a chance there is already a mongo running on this machine (i think the message would be "socket already in use" in that case, but...)
[18:52:33] <vanchi> i can take off one, make it an unused standalone rs with two other arbiters
[19:27:09] <owen1> "killall mongod" on the primary host. and the secondary still shows 'secondary'. isn't it supposed to become primary?
[19:27:45] <kali> owen1: you can't have failover with two servers. you need a third one, at least in the form of an arbiter
[19:48:25] <Virunga> I have a collection of docs like this { _id:'..', day:'..', log: [{time:'..', messages:[..]}]}, and with these params {_id: channel, 'log.day': day}, {'log.$.messages': 1, '_id': 0} the find method returns this { log: [...] }. How can i get only the messages field?
[19:48:36] <Virunga> Could you give me some advice please?
[19:49:20] <kali> just unfold the document with whatever language you're using to query mongodb...
[19:49:57] <Virunga> Yes, but isn't there a query that returns just what i want?
[19:50:39] <Virunga> i mean, the fields object to give to the find method
[19:51:32] <kali> Virunga: not really, mongo always returns objects
[19:52:10] <Virunga> An object like this { messages: [..] } would be ok
[19:52:11] <kali> Virunga: you can get them closer to what you want with the aggregation framework, but you'll never reduce them to the messages strings
[19:52:20] <kali> Virunga: i advise against doing it
[19:52:45] <kali> Virunga: this kind of processing is ok on the client side, which is easily scalable
[19:52:55] <kali> Virunga: so no reason to bother the server with it
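For completeness, a hedged sketch of the aggregation kali mentions (collection name assumed, field names taken from the question as written); the output is still a document per matching log entry, just with nothing but the messages field in it:

    db.channels.aggregate([
      { $match:   { _id: channel, "log.day": day } },
      { $unwind:  "$log" },
      { $match:   { "log.day": day } },
      { $project: { _id: 0, messages: "$log.messages" } }
    ])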
[21:11:58] <tqrst> I'm going through the tutorial in the browser shell on mongodb.org, and so far I'm having problems getting the find command to work: db.scores.save({a:99}); db.scores.find() returns [] instead of what I suppose should have been [{a:99}]. Did I misread something?
[21:15:18] <kchodorow_> tqrst: looks like it's broken
[21:24:59] <owen1> kali: oh. i didn't realize that. i guess i'll add a 3rd host on my dev environment, to test it.
[21:26:52] <JoeyJoeJo> How would I remove documents that have a specific field, regardless of what that field contains?
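A hedged sketch of the usual approach (collection and field names assumed): $exists matches documents that have the field at all, regardless of its value:

    db.mycollection.remove({ someField: { $exists: true } })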
[21:43:52] <owen1> what's the reason replica set doesn't work on 2 hosts? why 3?
[21:45:32] <kali> owen1: because when a member of a two-node replica set can't see its sibling, it can't know for sure whether the sibling is down or still primary but hidden behind a network split
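A minimal sketch of what kali describes (hostname assumed): keep the two data-bearing members and add a cheap arbiter so that elections can still reach a majority when one node is down:

    rs.addArb("arbiter.example.com:27017")
    rs.status()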
[22:22:01] <changerO1Sea> hey all! I got a $slice problem, maybe i'm reading the docs wrong, but I'm trying to get just one field from the first object in an array to show up, and it's showing everything in the document. can anyone help?
[23:01:35] <changerO1Sea> perhaps let me rephrase that question, I only want the first element in an array, instead I am getting the whole doc and the first element, can anyone help?
[23:08:01] <kchodorow_> changerO1Sea: you can say {x:0, y:0,...} for all the other fields in the second arg to find, or use the agg framework
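A hedged shell sketch of the first suggestion (field names assumed): $slice keeps only the first array element, and the remaining fields are excluded explicitly so only the array comes back:

    db.mycollection.find({}, { myArray: { $slice: 1 }, x: 0, y: 0, _id: 0 })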
[23:30:52] <jaimef> can you do db.fsyncLock() on the primary?
[23:47:01] <jaimef> can you merely copy an existing rs to a new host and start it up as a different replicaset name?