[01:30:58] <queuetip> i'm new to mongodb and this came up for me and i ended up just updating all the fields :/
[01:45:05] <mcr> is it hard to move to MongoDB.pm 0.702 (which uses MongoDB::MongoClient) from 0.45 (which uses MongoDB::Connection)? 0.45 is packaged in Debian, and therefore much easier to deploy.
[01:45:17] <mcr> I'm looking at why Connection() won't connect to IPv6 addresses.
[01:53:31] <mcr> reading source code, no IPv6 support.
[02:25:13] <thesheff17> I'm using pymongo... this works: output = mongo_collection.update({"revision": {"$gte": 1, "$lte": 5}},{"$set": {"status": False}}, multi=True) but this does not: output = mongo_collection.update({"revision": {"$gte": startValue, "$lte": endValue}}, {"$set": {"status": checkBoxValue}}, multi=True)
[02:25:35] <thesheff17> any ideas?...do I need to cast the types of startValue, endValue, and checkBoxValue...these are all unicode obj types
[02:29:40] <thesheff17> I got it...need to cast everything to the right type
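The same type-matching rule can be shown from the mongo shell: numeric bounds only match numeric values, and string bounds only match strings. A minimal sketch (collection name hypothetical):

    // matches documents whose numeric revision is between 1 and 5
    db.mycollection.update({revision: {$gte: 1, $lte: 5}}, {$set: {status: false}}, {multi: true})
    // string bounds like "1" silently match nothing when revision is stored as a number
    db.mycollection.update({revision: {$gte: "1", $lte: "5"}}, {$set: {status: false}}, {multi: true})

In pymongo the equivalent fix is what thesheff17 found: cast the incoming unicode values to the stored types (e.g. an int for the bounds, a real boolean for the status) before building the query.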
[09:39:00] <Richhh> i just want to use mongodb as a key value store like redis, how can i best do this? im guessing using collection names as keys, containing a single object value is a bad idea?
[09:39:34] <Richhh> or maybe im using the wrong database
[09:41:00] <zamnuts> Richhh, use a collection to store a single KVP, from there, you could add meta (i.e. tag/label) to further isolate it from other unrelated KVPs
[09:41:40] <zamnuts> Richhh, although you need to be cautious of flooding the mongodb socket (?), upserts will start failing if you're doing too much too fast
[09:42:11] <zamnuts> Richhh, that is... without the right mongodb infrastructure
[09:45:16] <zamnuts> redis is typically in-memory, so it is supa-fast and can handle frequent reads/writes, that's not really what mongodb is for, but w/o knowing your application I am simply supplying the warning
[09:45:55] <zamnuts> Richhh, ^ + when you say "KVP like redis" - that's the first thing i thought of, keep in mind mongodb is essentially a KVP database...
[09:52:48] <Richhh> i want to implement 'sometag':[id,id,id...100 ids] and id:'string' in a way that enables the highest read and upsert throughput
[09:53:09] <Richhh> essentially tag:['string,'string',...] without dupes
[09:58:46] <ruphos> fwiw, redis sets should prevent dupes and sorted sets can also preserve order
[10:00:27] <Richhh> thanks ruphos , any idea zamnuts ?
[10:00:36] <Nodex> Richhh : why don't you just use redis? - that's what it's built for
[10:08:14] <zamnuts> Richhh, in what sense do you want to read/write to a database, whether it is redis or mongodb...? is this for cached data? how important is performance?
[10:08:17] <zen_> 25% locked on average is too high
[10:13:17] <zamnuts> zen_, w/ mongodb, vertical scaling is better than horizontal scaling, i.e. replication sets / sharding opposed to a machine with SSD
[10:13:46] <Nodex> read locks are shared, all they do is stop a write lock from happening and they're fast. It's highly doubtful that it's read locks that's causing your problems
[10:13:50] <zen_> know that, but thought sharding could wait a few months
[10:14:15] <Nodex> zamnuts : you also have the scaling the wrong way round. Horizontal scaling is better
[10:14:32] <zen_> no, from mongostat I could say inserts and updates increase my lock %
[10:17:39] <Richhh> zamnuts: throughput is more important than latency (if they are disentanglable) in both reads and writes, writes will be users adding a new string to the specified tag arrays (of strings), reads will be users querying with tags as keys for said string arrays as values
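For the no-duplicates requirement Richhh mentions, a hedged sketch in the mongo shell: $addToSet only appends a value if it is not already in the array, and upsert creates the tag document the first time it is seen (collection and field names are hypothetical):

    // add an id to a tag's array without creating duplicates
    db.tags.update(
        {tag: "sometag"},
        {$addToSet: {ids: "someid"}},
        {upsert: true}
    )
    // read the whole array back in one round trip
    db.tags.findOne({tag: "sometag"})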
[10:17:43] <Nodex> and or grep nscanned as kali suggests?
[10:18:05] <Scud> Hi, im having trouble with the sort() operation. does mongodb allow me to apply the sort operation on a subdocument? e.g. entry: {foo:[obj1, obj2, obj3]} can i sort regarding parameters contained in objects 1..3?
[10:18:44] <zamnuts> zen_, what is your write concern?
[10:18:46] <Nodex> Richhh : I do similar things to that, I use Redis as a queue/cache in between the mongo persistence, gives me the best of both worlds
[10:19:03] <Nodex> it shows you what is locking your database (the query)
[10:19:21] <zen_> ah, answer to my question... sorry
[10:20:19] <kali> zen_: high figures there show slow queries scanning lot of documents. indexing the queries can help a lot
[10:21:55] <zen_> you know the dex tool. as far as that goes i have no slow queries
[10:28:25] <zamnuts> Richhh, i'm having trouble distinguishing your use case, you want to use it like you would redis/memcached but you are storing multiple TB of data
[10:38:10] <Richhh> zamnuts: i was thinking to store the arrays of indices to the strings in redis, then look them up with mongodb
[10:38:47] <Richhh> just .get() the whole array of indices
[10:39:09] <Richhh> i guess thats not gonna work because you'd have to query every one of them then
[10:40:37] <Scud> ah got it, didnt know i required the aggregate command
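For reference, a minimal sketch of the kind of pipeline Scud is referring to: $unwind flattens the embedded array, $sort orders by a field inside each element, and $group reassembles the array (collection and field names hypothetical):

    db.entries.aggregate([
        {$unwind: "$foo"},
        {$sort: {"foo.someField": 1}},
        {$group: {_id: "$_id", foo: {$push: "$foo"}}}
    ])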
[10:40:57] <Richhh> i guess i need to think and research more about this
[10:42:49] <LoonaTick> Hi, I have a question. Reading in the MongoDB documentation I saw that MongoDB is atomic within 1 document. I'm trying to create an atomic lock around some piece of processing code, so I have a collection "lock", with a unique key on 'name'. For debugging purposes I do not intend to clear the documents from the collection when the lock is removed, but just have a 'processing' boolean in the
[10:42:49] <LoonaTick> document. When I update this lock I update the processing field, where name = 'lockname' and processing = false. I was wondering: in concurrent situations, will the update statement always return a correct value if another thread has already updated the document? In other words: with the atomicity, are the criteria of the query checked within that atomic operation in MongoDB?
[10:43:40] <LoonaTick> I mean: will the update statement always return a correct value of the number of documents affected by the query?
[10:43:42] <kali> LoonaTick: you need findAndModify
[10:43:55] <LoonaTick> kali: Thanks very much, I will read in to that
[10:51:01] <LoonaTick> kali: If I understand the documentation properly, I have to do findAndModify on the document. I make it return the old version of the document and check if the processing boolean was false before that query updated the document. Is that correct?
[10:52:40] <kali> LoonaTick: mmmm yeah that's the idea
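A minimal sketch of the lock LoonaTick describes, using findAndModify: the match on processing: false and the update happen atomically, and by default the pre-update document (or null if nothing matched) is returned. Collection and field values are hypothetical:

    // try to take the lock; returns the old document, or null if another thread holds it
    var previous = db.lock.findAndModify({
        query:  {name: "lockname", processing: false},
        update: {$set: {processing: true}}
    })
    if (previous === null) {
        print("lock already held")
    }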
[13:10:00] <Richhh> so if i do ensureIndex({'k':1},{unique:true}), is mongodb going to be able to lookup those strings, or is it going to iterate through objects to find them?
[13:10:24] <Richhh> Nodex: as i understand, a graphdb can implement a key-value store
[13:10:35] <Nodex> db.foo.find({k:"somestring"}); <--- will use an index
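To confirm the index is actually used, explain() on that query should report a BtreeCursor on the k index rather than a BasicCursor (which would mean a full collection scan):

    db.foo.ensureIndex({k: 1}, {unique: true})
    db.foo.find({k: "somestring"}).explain()   // expect "cursor" : "BtreeCursor k_1"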
[13:11:03] <Nodex> and "right tool for the job" r/e "can be implemented as"
[13:11:43] <Nodex> by that I mean - use the right tool for the job
[13:16:00] <Nodex> if you mean "bar" with only foo:1 in it then no, not without an aggregation / map/reduce
[13:17:17] <Richhh> binary tree would still not be a lookup like with a KVP, and so still be slow for fetching 100s of objects from among a large number of objects, wouldnt it?
[13:17:45] <Richhh> talking about 100s of randomly distributed objects among a very large set of objects
[13:17:53] <Nodex> the cursor fetches the documents
[13:29:31] <dandre> joannac: in my test case 400kB
[13:35:35] <Richhh> joannac: the data (values for each id) collection size could grow to TBs from the large number of objects, each object being {k:'uniquekey',v:'upTo100Bstring'}
[16:13:29] <Mmike> Hello! Is there a way to have secondaries 'removed' from the mongodb replset cluster while it's catching up with the rest of the cluster?
[16:14:34] <Mmike> I'm doing a repair that takes approx. 6 hours to complete. When I fire up the repaired secondary it takes approx. 20-30 minutes for it to be in sync with the rest of the cluster - but during that time mongod is accepting connections - other than a firewall, is there a way to tell mongod not to allow connections until it syncs?
[16:48:05] <Joeskyyy> Mmike: You can try a hidden replset member until you're all caught up?
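A sketch of what Joeskyyy is suggesting, assuming the syncing box is members[2] in the replica set config: hidden members still replicate but are never advertised to clients, and they must have priority 0 so they cannot be elected:

    cfg = rs.conf()
    cfg.members[2].hidden = true      // index of the member being repaired (assumed)
    cfg.members[2].priority = 0       // hidden members must not be electable
    rs.reconfig(cfg)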
[16:52:06] <brendan6> Does anyone know if there is a way to create an index so that that the query db.collection.find({'foo.0.bar': 1}) uses an index for a document {foo: [{bar: 1},{bar: 2}]}?
[16:52:56] <michaelholley> I'm looking to find an official word of supported distribution of Linux. Any suggestions of where to look?
[16:53:56] <michaelholley> I've looked all over mongodb.org and haven't found anything stated beyond RHEL, EC2, and Azure.
[17:05:55] <Joeskyyy> michaelholley: You should be able to just pull the files from this tutorial: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-linux/
[17:06:09] <Joeskyyy> Granted, upgrades and such are a little more painful without package managers
[17:06:21] <Joeskyyy> But it works. (On my OpenSuse install at least)
[17:08:37] <Mmike> Joeskyyy, thnx, will see how that will work. There is an issue with re-election, and my cluster being unavailable for 10-20 seconds, which I can't really have :/
[17:10:10] <Mmike> Joeskyyy, mongod docs say 'do not run arbiter on replicaset member' - do you know why is that?
[17:10:35] <Joeskyyy> Because if that part of the repl set goes down, so does your arbiter.
[17:10:47] <Joeskyyy> Which is pretty essential in an election process if you have one.
[17:11:24] <Joeskyyy> Typically you run an arbiter on a node outside of a repl set member, like say a mongos or something lightweight. Just so you have an extra voting member without having to sync data.
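A sketch of adding an arbiter from the primary, assuming a separate lightweight host is available for it (hostname is a placeholder):

    rs.addArb("arbiter-host.example.com:27017")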
[17:11:57] <michaelholley> Joeskyyy: Thanks for the link. You are right, I would prefer a package manager, but at least there is a manual install method.
[17:16:49] <Purefan> Hi everyone! Not sure I'm understanding the docs properly, could someone please tell me: in replication, if one of my shards (let's say I have 10 collections and 2 shards) goes down, will the apps still be able to get info from the db?
[17:17:27] <joannac> do you mean shard or relica set node?
[17:18:07] <Purefan> shard, Im not setting specific replicas so as I understand each shard would be its own and only replica
[17:34:38] <Joeskyyy> Purefan: If your shard goes down, then the data contained in that shard is also inaccessible, you'd need a sharded replset for autofailover and sharding
[17:41:00] <michael_____> Hi all, is it a problem to add tracking events to each of 1000+ embedded documents (e.g. blowing up the whole document?)
[17:41:28] <joannac> you might hit the document size limit
[17:42:17] <michael_____> so is it a better approach to store a referenced document for each embedded document which will contain all the tracking?
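One hedged way to do what michael_____ describes: keep tracking events in their own collection and reference the parent and embedded document by id, so the parent document never grows (all field and collection names here are assumptions):

    var parentDocId = ObjectId()   // in practice, the _id of the existing parent document
    db.tracking.insert({
        parentId:   parentDocId,
        embeddedId: 17,            // identifier of the embedded document being tracked
        event:      "opened",
        at:         new Date()
    })
    db.tracking.ensureIndex({parentId: 1, embeddedId: 1})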
[18:18:21] <monmortal> hi gents I added node 4 to a single replicaset cluster but made the mistake of starting it without my mongodb config file, only the default
[19:04:19] <kali> what ? you still have a working RS... why would you have data loss ?
[19:04:36] <kali> show us the full log of the failing instance
[19:11:49] <treaves> For queries on array fields, why is there no $contains operator?
[19:12:23] <treaves> I have to use $all, and specify an array with a single argument, in order to find documents with an array field that contain a value I want.
[19:13:01] <treaves> Whereas these two (in this case) are semantically the same, I'd think the cost to execute would be much higher.
[19:14:43] <kali> treaves: "a": "b" will match a: ["b"]
[19:15:59] <treaves> kali: I did not realize that. Thanks!
[19:16:42] <treaves> (although that's a bit not-obvious)
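A tiny illustration of the rule kali mentions (collection name hypothetical):

    db.things.insert({a: ["b", "c"]})
    db.things.find({a: "b"})    // matches: equality on a scalar value also matches array elements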
[19:18:46] <TheDracle> So, I'm storing time series data in my mongodb.
[19:18:59] <TheDracle> And, at the moment I'm using a flat row by row model like described above.
[19:19:13] <TheDracle> I actually started with a model more similar to what they describe, with embedded blocks of time series data.
[19:19:27] <TheDracle> But the issue is- the data I have is sporadic, and comes at any time.
[19:19:34] <TheDracle> And I want it to immediately update into the database.
[19:19:43] <TheDracle> Like, I don't want to cache a minute worth of data before doing an insert.
[19:19:59] <TheDracle> I want to push the very first data point, so people can see it immediately as it occurs in the database.
[19:20:19] <TheDracle> So.. The issue was, I was performing 'update' on an embedded array, and inserting the new values.
[19:20:32] <TheDracle> And it ended up being very very slow...
[19:20:59] <TheDracle> It seems like the only way to make it work well is to basically wait for a minute's worth of data, and do a single insert with a document containing that data embedded internally.
[19:22:06] <TheDracle> With the row by row model, the insertion is fast, but reading it out is slow.
[19:25:19] <whaley> TheDracle: it's slow because mongo has to allocate new disk space and move the entire document over with your appended values added once the document size goes over a certain threshold. It's easily the biggest performance problem I have with my system's usage of mongo at present.
[19:25:59] <TheDracle> whaley, Yeah, I assumed as much.. It also started causing the size of the document to explode.
[19:26:18] <whaley> TheDracle: when you say the row by row model is slow, what is slow about it? have you tried using aggregation?
[19:26:21] <TheDracle> whaley, Have you done any profiling to figure out where exactly it starts to move the document?
[19:26:47] <whaley> TheDracle: one of my coworkers has a pretty detailed ticket with 10gen on it. let me ask
[19:26:47] <TheDracle> whaley, Just the read out of the data is slow- I.E: for the reasons described in the above recommendation on storage of time series data.
[19:26:52] <TheDracle> Since it can store it anywhere on the disk.
[19:27:04] <whaley> TheDracle: you might be able to help that with proper indices
[19:27:07] <TheDracle> When I do a Collection.find() to get all of the data, it has to seek for every document that is dispersed.
[19:33:12] <TheDracle> Nodex, Hm, yeah, I was worried something like that would be necessary.
[19:33:47] <Nodex> I have a rather large server (RAM) dedicated for that kind of thing, then move it to Mongo for persistence
[19:34:05] <TheDracle> I was thinking maybe I could have a process that every hour or something bundles up all of the data into a structure like above...
[19:34:21] <TheDracle> I.E: Pulls the single data points, bundles them into embedded documents, and removes the previous ones...
[19:34:21] <Nodex> I store the raw (json in my case) as files in gridfs and store aggregates in mongo after the 2 weeks (a month in my case)
[19:34:36] <TheDracle> I'm just scared it will do something weird, and cause an efficiency issue elsewhere.
[19:35:00] <Nodex> the rest is all read/written to Mongo in a queue and I use a redis slave for that also
[19:35:46] <Nodex> the pain is that modifying data is a migration :/
[19:36:23] <Nodex> luckily my data is pretty uniform and fits nicely, ironically the adverse of unstructured data
[19:36:36] <TheDracle> Yeah... It sounds a lot more painful than I had hoped.
[19:37:22] <Nodex> it really depends on your data tbh
[19:39:31] <TheDracle> Nodex, I'll check out Redis..
[19:39:48] <TheDracle> It seems like there ought to maybe just be some sort of formal Redis system for solving this problem.
[19:40:42] <Nodex> for persistence you will want something like mongo
[19:41:23] <brendan6> Does anyone know if there is a way to create an index so that that the query db.collection.find({'foo.0.bar': 1}) uses an index for a document {foo: [{bar: 1},{bar: 2}]}?
[19:41:53] <TheDracle> I think I'm going to try the sort of garbage collector approach first. I.E: Every hour process comes by, and bundles everything from the last hour into a document, and then trashes the rest of the data.
[19:42:04] <brendan6> explain() tells me that no index is used when I expect db.collection.ensureIndex({'foo.bar': 1}) to be used
[19:45:53] <Nodex> brendan6 : if you have a document {foo: [{bar: 1},{bar: 2}]} and you do db.foo.ensureIndex({"foo.bar":1}); then all of bar should be indexed
[19:46:25] <Nodex> can you pastebin your explain() and your db.foo.getIndexes();
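For what it's worth, a sketch of the distinction behind brendan6's question: the multikey index is built on the path foo.bar, so a query on that exact path can use it, while the positional path foo.0.bar is a different key and falls back to a collection scan:

    db.collection.ensureIndex({"foo.bar": 1})
    db.collection.find({"foo.bar": 1}).explain()     // can use the foo.bar index
    db.collection.find({"foo.0.bar": 1}).explain()   // positional path, not covered by that index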
[20:16:51] <NaN> on this collection > http://pastie.org/private/mg70dolewktolhiq937w < what would the query be that gets me all the documents with 'parent' : '23489AJ'?
[20:18:08] <joannac> MongoDB doesn't support searching with a wildcard on keys. Why do you have custom keys?
[20:20:22] <NaN> I don't know, the collection comes from a JSON file, but I suppose I will need to regenerate it
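If regenerating it is an option, the usual fix is to turn the dynamic key into a value so it can be indexed and queried directly. A sketch, since the actual structure in the pastie isn't visible here (all names assumed):

    // instead of {"<dynamicKey>": {parent: "23489AJ", ...}}, store the key as a field:
    db.items.insert({key: "someDynamicKey", parent: "23489AJ"})
    db.items.ensureIndex({parent: 1})
    db.items.find({parent: "23489AJ"})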
[20:35:13] <monmortal> oit buggers be with a monringstar
[21:14:58] <pinvok3> Good evening guys. I'm writing a small web-based feed application for myself. I've found that mongodb is a bit more suitable for storing rss data than mysql. My question is, what is the best way to update a document when new feeds are available? I have a collection for each feed, and only one document. This document should contain an array of ~300 feed items. Newer ones get added, older ones
[21:15:08] <pinvok3> removed. Can someone help me a bit?
[21:15:46] <pinvok3> Or should I create a new document for every feed item?
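One hedged option for the single-document approach pinvok3 describes: $push with $each, $sort and $slice keeps only the newest 300 items in the array in a single update (feed identifier and item fields are assumptions):

    db.feeds.update(
        {feed: "http://example.com/rss"},
        {$push: {items: {
            $each:  [{title: "new item", published: new Date()}],
            $sort:  {published: 1},
            $slice: -300               // keep only the newest 300 items
        }}},
        {upsert: true}
    )

A document per feed item also works and avoids rewriting one ever-growing document; which is better depends on how the items are read.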
[21:26:51] <a|3x> i am getting an exception when i am trying to search only for SOME keywords: db['collectionname'].runCommand('text', { search: 'some keywords', limit: 1, language: 'english', filter: {"removed": {"$exists": false}} }); "exception: BSONObj size: -286331154 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB)"
[21:27:19] <a|3x> the strange thing is i only get this error if filter is specified, no filter returns 1 element
[21:28:27] <a|3x> any filter has this effect, even {"_id":{"$nin":[]}}
[21:40:49] <brendan6> Nodex: Here is the pastebin outlining it: http://pastebin.com/HFZ4eYm8
[21:46:37] <Mmike> Joeskyyy, but, if I have 4 boxes, and I put an arbiter on one, when that box goes down there will still be three node members and they'll be able to vote... no?
[21:49:01] <Joeskyyy> Correct. I think the documentation means something like
[21:49:48] <Joeskyyy> If you have three servers, each having a member of a replset, don't throw your arbiter on one of those three servers. Because then you won't have a majority if the server that goes down is the one with your arbiter on it.
[21:50:38] <Joeskyyy> The point of an arbiter is to be agnostic of your replset functions, not affected by the possibility of something happening with a member of the replset.
[21:50:50] <Mmike> Hm, but with 3 boxes I don't need an arbiter, right?
[21:51:28] <Joeskyyy> Correct. There's a lot of here's and theres with the examples
[21:51:42] <Joeskyyy> but it's just a little pointless to put an arbiter on a point of failure.
[21:51:57] <Joeskyyy> Because you can have two points of failure where there would typically only be one.
[21:52:30] <Joeskyyy> None, in reality. Unless you hit a snag like I mentioned. Again, it's just bad practice.
[21:52:50] <Mmike> I have 4 boxes in replset. I have arbiter on 5th box so I have 5 voting members. If one box goes down, all is gut. If box with arbiter goes down, all is also gut.
[21:53:47] <Joeskyyy> What's the point of your arbiter then? If one box goes down you still have 3/4 to decide.
[21:54:03] <Joeskyyy> Still a majority for the decision.
[21:54:34] <Joeskyyy> Taking the same example above, two nodes go down, one of those two has your arbiter on it. Now you only have 2/5 to vote.
[21:54:45] <Joeskyyy> Whereas if your arbiter was on its own low-risk node, you'd still have 3/5
[21:55:39] <Joeskyyy> You can take that same example and keep adding more nodes and the results would still be the same, just on a higher scale.
[22:23:01] <tongcx> hi guys, i have a big database and I want to map these data to another collection which just keep some fields, what's the best way to do it?
[22:23:17] <tongcx> i know i can use mapreduce, but the reduce step is not necessary and it will slow things down
[22:28:21] <monmortal> zodb might not be bad either
[22:28:26] <tongcx> hi guys, i have a big database and I want to map these data to another collection which just keep some fields, what's the best way to do it?
[22:28:29] <tongcx> i know i can use mapreduce, but the reduce step is not necessary and it will slow things down
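A sketch that avoids map/reduce entirely for tongcx's case: project only the fields you need and insert them into the target collection from the shell (collection and field names assumed):

    db.source.find({}, {fieldA: 1, fieldB: 1}).forEach(function (doc) {
        db.target.insert(doc)
    })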
[23:09:22] <tongcx> hi, if i write a script.js and run mongo script.js, how do I change the db in script.js? if i write 'use xxx', there is an error
[23:13:50] <ehershey> tongcx: try db = db.getSiblingDB('otherdb');
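A minimal script.js sketch of ehershey's suggestion ('use xxx' is a shell helper, not JavaScript, so it fails inside scripts; the collection name is a placeholder):

    db = db.getSiblingDB("otherdb")
    printjson(db.mycollection.findOne())

Run it with: mongo script.js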
[23:17:03] <schmidtj> Hello. Does anyone have a systemctl service definition file to run MongoDB as a service? I could create one, but this has got to be something someone else has already done before me. Anyone know where I could find one?
[23:23:01] <blizzow> okay, total n00b question here. I have a collection called foo with an index on bar. I want to do a count of all documents where the value of bar is set to 1. I failed when trying this -
[23:34:53] <blizzow> monmortal. I'm looking at the little mongodb book and have tried a couple of different ways to query, but am obviously doing something wrong. Pardon my retardedness, I figured I'd ask for some clarification here.
[23:35:44] <Joeskyyy> blizzow: Is the value set to the string value "1" or the integer value 1?
[23:36:20] <blizzow> I think it's integer value 1.
[23:36:42] <Joeskyyy> Try db.domains.count({active: 1}) then
[23:36:47] <Joeskyyy> Without the '' around the 1.
[23:36:53] <Joeskyyy> That would signify a non integer value
[23:39:21] <Joeskyyy> Rather, putting the value IN the '' signifies a non-integer value
[23:41:50] <blizzow> It's returning 0 no matter what I try.
[23:46:31] <blizzow> I posted an image here http://i.imgur.com/C3sX3e1.png of what I'm trying to grab (a count of all records in the active column that are set to 1).
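A quick way to check whether the stored value is numeric or a string, and to count accordingly (using the domains collection from the screenshot):

    db.domains.findOne()               // inspect how "active" is actually stored
    db.domains.count({active: 1})      // counts documents where active is the number 1
    db.domains.count({active: "1"})    // counts documents where active is the string "1"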
[23:53:07] <schmidtj> monmortal: OpenSuse 13.1, which uses the 'systemctl' startup/service manager package.