[00:03:40] <keeger> but no additional fields get added
[00:04:05] <GothAlice> Then capped collections won't really work out for you unless you account for that by adding padding to the records yourself, then removing that padding. There are some caveats to that approach, though I do not recall where I read about them.
[00:05:13] <keeger> so i saw that you save space by using single character names, and using a mapper to resolve it to more meaningful stuff
[00:05:33] <keeger> does 3.0 help with this in any way?
[00:05:43] <GothAlice> Basically it means I need to use a REPL from the language my app is written in, rather than the raw MongoDB shell.
[00:06:09] <GothAlice> It does; through native compression support I don't need to do this shortening any more. The keys will effectively be huffman coded throughout the collection.
[00:06:19] <GothAlice> Which means 100% more mongo shell love.
[00:06:41] <keeger> oh, you mean like de-duplication?
[00:06:52] <GothAlice> That's what modern lossless compression is. :)
[00:07:04] <GothAlice> (With various numbers of additional tricks applied.)
[00:07:06] <keeger> ah. i have a different background :)
[00:07:21] <keeger> but i remember backups doing data deduping to save huge amounts of space
[00:07:48] <keeger> is the compression WT or mmap? or both?
[00:09:23] <GothAlice> Consider "tar" and "gz" files. "tar" packs a directory structure together into a single file. The gzip tool then takes that file and allocates space for a "dictionary" of known terms. It then reads in the tar file, and as it sees patterns adds them to the dictionary and writes out the index into that dictionary. To decompress you then just need the dictionary and the index list which is your compressed data.
[00:09:44] <joeyjoeyjoe111> "Index keys that are of the BinData type are more efficiently stored in the index if: the binary subtype value is in the range of 0-7 or 128-135, and the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32." <--- Can anyone provide a link as to why this is? particularly the size of the byte array?
[00:09:44] <GothAlice> WT does compression intelligently, taking into account its own internal structures.
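A small mongo-shell sketch of how that looks in practice with WiredTiger; the collection name and the choice of zlib are illustrative, and the engine-wide default compressor is normally set in mongod's configuration instead:

    // Per-collection compressor override at creation time (3.0+, WiredTiger only).
    // "events" is a made-up collection name; zlib trades CPU for a smaller footprint than the default snappy.
    db.createCollection("events", {
        storageEngine: { wiredTiger: { configString: "block_compressor=zlib" } }
    });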
[00:10:20] <GothAlice> joeyjoeyjoe111: Choice of hashing algorithms.
[00:10:59] <joeyjoeyjoe111> GothAlice: Can you link me to source code or documentation that elaborates a bit more?
[00:11:01] <GothAlice> I'd dig through the mongodb source code to find the indexer code that checks those things.
[00:13:53] <GothAlice> (PEEK/POKE let you read and write arbitrary memory locations, it's how you do things like high graphics modes and get around copy protection on "borrowed" games. ;)
[00:26:58] <GothAlice> http://techidiocy.com/write-concern-mongodb-performance-comparison/ — you can work back from writes/sec to msec/write.
[00:27:35] <keeger> that's with journaling on though right?
[00:27:59] <GothAlice> In various combinations. https://whyjava.wordpress.com/2011/12/08/how-mongodb-different-write-concern-values-affect-performance-on-a-single-node/ is another, but neither of these touch on replica-safe write concerns, which only get slower…
[00:28:01] <keeger> oh wow, i can't read a table, my bad
[00:28:27] <GothAlice> A ha! *This* is the one I was looking for: http://www.nonfunctionalarchitect.com/2014/06/mongodb-write-concern-performance/
[00:30:09] <GothAlice> Performance in 3.0.0 may differ, of course.
[00:31:20] <keeger> urg that last one doesn't tell me which color goes to which setting heh
[00:31:43] <GothAlice> FTA: "The slowest of these are FSYNCED, FSYNC_SAFE, JOURNALED and JOURNALED_SAFE (with JOURNALED_SAFE being the slowest)."
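For reference, a rough mongo-shell illustration of the write-concern levels those benchmarks compare; the collection name is made up and the actual timings obviously depend on hardware:

    db.events.insert({ t: new Date() }, { writeConcern: { w: 0 } });           // unacknowledged, fastest
    db.events.insert({ t: new Date() }, { writeConcern: { w: 1, j: true } });  // acknowledged and journaled
    db.events.insert({ t: new Date() }, { writeConcern: { w: "majority" } });  // replica-safe, slowest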
[00:36:16] <GothAlice> Drat, the Bachelor presentation we gave didn't cover how we lazily evaluate scores. :/
[00:36:40] <keeger> one thing i was thinking would be nice
[00:36:40] <BadHorsie> Say I have this doc: {"vlans":[{"name":"1","entries":[{"ip":"10.1.1.1","name":""},{"ip":"10.1.1.2","name":""}]}]} And I want to update the name of ip 10.1.1.1 (Without using vlans.0.entries.0.name, more like find and I guess maybe the $ operator)
[00:36:49] <keeger> would be a shared nothing, so each webserver has a copy of the db
[00:37:45] <keeger> but i dunno about how much time would be spent purely keeping the dbs in sync
[00:37:54] <agenteo> oh man… it turns out it is working for me too… the effect is so subtle that it’s hard to see!
[00:37:54] <GothAlice> BadHorsie: You want http://docs.mongodb.org/manual/reference/operator/query/elemMatch/ and http://docs.mongodb.org/manual/reference/operator/update/positional/#up._S_
[00:38:54] <GothAlice> BadHorsie: No worries. If after reading those and playing around a bit in a mongo shell you still have questions, we're all ears. :) (The examples are pretty comprehensive.)
[00:39:45] <Boomtime> BadHorsie: am i correct in saying you have an array nested in another array (2 deep) and want to update a single array element of the inner-most array?
[00:40:19] <Boomtime> if so, then sorry: https://jira.mongodb.org/browse/SERVER-831
[00:40:22] <GothAlice> Boomtime: Good catch! BadHorsie: Your data design will require you to load the document and issue a somewhat more involved update.
[00:40:38] <GothAlice> Boomtime: +1 on the $n syntax.
[00:41:13] <keeger> well time for the dinner. thx for the help everyone, most especially GothAlice. i'll be back to wrangle out how to setup my mongo dbs :)
[00:41:35] <GothAlice> keeger: Ping me when you get back and I'll gist you some lazy scoring code. :)
[00:45:20] <BadHorsie> I guess that will be a bit slow/expensive...
[00:45:33] <Boomtime> the find can still narrow your search to positive matches only
[00:45:48] <GothAlice> BadHorsie: That is an approach. Be careful to catch the cursor timing out and re-trying from where you left off in case it takes a while, but that shouldn't be a problem if you're typically only updating a small set. Also, what Boomtime says. Note the race conditions and potential for this update to overwrite other updates more broadly than just the field you are updating. (Since you'll need to update the whole sub-array.)
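A minimal sketch of that read-modify-write approach against BadHorsie's document shape; the collection name and the new "name" value are assumptions:

    var doc = db.devices.findOne({ "vlans.entries.ip": "10.1.1.1" });      // "devices" is hypothetical
    doc.vlans.forEach(function (vlan) {
        (vlan.entries || []).forEach(function (entry) {
            if (entry.ip === "10.1.1.1") { entry.name = "gateway"; }       // illustrative value
        });
    });
    // Rewrites the whole array, so concurrent edits to sibling entries can be silently lost.
    db.devices.update({ _id: doc._id }, { $set: { vlans: doc.vlans } });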
[00:46:43] <BadHorsie> Perhaps I should consider a different structure..
[00:46:50] <GothAlice> It's something you may wish to refactor at some point; give the top-level "vlans" array its own collection, with back-references.
[00:47:05] <GothAlice> Then it's only a one-level deep nesting.
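Once each vlan is its own document, a sketch of what the single-level update could look like (the "vlans" collection name, back-reference, and "gateway" value are all assumptions):

    db.vlans.update(
        { name: "1", entries: { $elemMatch: { ip: "10.1.1.1" } } },
        { $set: { "entries.$.name": "gateway" } }      // positional $ can now reach the matched entry
    );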
[01:00:50] <agenteo> @GothAlice I got it, so in iTerm there is an option called “Draw bold text in bright colors”, checked by default. If unchecked I get the same behaviour as you. yay
[01:25:37] <stiffler> I have a problem with queries. This is my schema: http://codepaste.net/93c7rf . It also shows the query I have written so far, but I get too many results. It's like the condition stops.timetable.hour: '6' doesn't work; hours other than 6 come back as well
[01:27:13] <stiffler> could anybody help? Have I explained my problem well enough?
[01:27:19] <GothAlice> Are you positive you mean that arrangement of type and hour check?
[01:28:20] <stiffler> if I have understood you right, yes
[01:28:21] <GothAlice> The $elemMatch is effectively doing nothing, there.
[01:28:34] <stiffler> I have tried with and without
[01:28:48] <panurge> strange problem.. I'm using mongo 2.6 but db does not have createUser method
[01:28:50] <stiffler> basically I only want to get the timetable
[01:29:00] <stiffler> I don't need any other fields
[01:29:07] <GothAlice> It's finding any document that contains _any_ timetable with type "12" (the string "12") that also contains _any_ timetable with hour=='6' (the string "6")
[01:29:41] <GothAlice> (That also has any stop that includes a lineNr of 817.)
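Roughly, the difference between the two query shapes against stiffler's schema looks like this (field names taken from the paste):

    // Matches documents where ANY timetable entry has type "12" and ANY (possibly different) entry has hour "6":
    db.stops.find({ "stops.timetable.type": "12", "stops.timetable.hour": "6" });
    // Requires a single timetable entry to satisfy both conditions at once:
    db.stops.find({ "stops.timetable": { $elemMatch: { type: "12", hour: "6" } } });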
[01:30:48] <stiffler> so how do I get the timetable with only type: 12 and only hour: 6?
[01:30:51] <GothAlice> Oh bugger me, it's another doubly nested list.
[01:31:12] <stiffler> sorry, these are my first steps with mongodb
[01:33:27] <stiffler> basically it returns the whole document, but I need only the timetable, depending on the parent fields
[01:33:34] <GothAlice> Double list nesting makes things exceedingly difficult to query, and most notably, update. It is generally recommended to split out whatever is double-nested ("stops", which is ironically inside a collection called "stops") so that at no level is there more than a) one level of list nesting, and b) ensure if you have multiple sibling lists that you never need to query more than one at a time.
[01:34:57] <GothAlice> Some nesting can be OK. For example, I nest replies to forum threads within the thread document.
[01:35:30] <stiffler> ok, but I've really nearly finished this project, and I don't want to take a step back by refactoring half of my code. is it impossible to make a correct query with those nests?
[01:39:52] <GothAlice> db.stops.find({'stops.lineNr': '817', 'stops.timetable': {'$elemMatch': {'type': '12', 'hour': '6'}}}) — does this describe what you are looking for? (I can not get a firm grasp of what you are looking for from the original query.)
[01:42:34] <stiffler> just so there were no timetables with other types in the result
[01:43:44] <GothAlice> We may be running into point three of: http://docs.mongodb.org/manual/reference/operator/projection/positional/#array-field-limitation
[01:45:49] <stiffler> The query document should only contain a single condition on the array field being projected. Multiple conditions may override each other internally and lead to undefined behavior.
[01:46:36] <GothAlice> stiffler: It all applies in your current schema situation.
[01:47:29] <stiffler> hmm... so do I have to change the schema, or are there other ways to solve my problem?
[01:47:47] <GothAlice> Everhusk: Those describe replacing the text you have entered ($price) with the value of the field with the same name, less the $. If you project just "price", the value of the field will be "price".
[01:48:36] <GothAlice> stiffler: You'll have to do a query to find the document that matches and filter the first level, but you'll still get back more timetables than you expect. You'll have to filter those application-side.
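A sketch of that application-side filtering step, using the query shape from above; collection and field names follow the schema under discussion:

    var doc = db.stops.findOne({
        "stops.lineNr": "817",
        "stops.timetable": { $elemMatch: { type: "12", hour: "6" } }
    });
    // The matched document still carries every timetable entry, so keep only the wanted ones:
    var wanted = [];
    doc.stops.forEach(function (stop) {
        (stop.timetable || []).forEach(function (t) {
            if (t.type === "12" && t.hour === "6") { wanted.push(t); }
        });
    });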
[01:49:34] <stiffler> so hard javascript code will be my solution?
[01:49:55] <GothAlice> stiffler: It's a "why have a database" situation. It's certainly worth a little refactoring now…
[01:51:10] <stiffler> ok, it's nearly 3am in my timezone. I have to think about what I want to do with this.
[01:51:24] <stiffler> so this schema should be split into 3 documents
[01:51:45] <stiffler> like stops and timetable, yes?
[01:52:24] <GothAlice> Well, you have a collection of documents called stops containing an array called stops which contains an array called timetable. Those last two should be in their own collection.
[01:52:33] <GothAlice> (You can keep those nested; one level is A-OK.)
[01:54:13] <stiffler> so {stops: [], timetable: []} and then i will have to add a field to timetable which will help me keep the relation between stops and timetable?
[01:56:25] <GothAlice> … is there some reason why you aren't creating collections (other schemas)? That's still using the embedding notation. (What you wrote would be a document with two lists, and you can only query one at a time.)
[01:58:15] <GothAlice> Or I may be misreading this JavaScript. *shakes a fist at Mongoose and all the darn curly braces*
[01:58:19] <stiffler> I thought that nesting is the biggest advantage of mongodb and that's how we keep relations
[01:59:02] <GothAlice> Yes and no. It requires restraint and understanding, like any tool. I gave the link that outlines the limitations. One has to work within those limitations.
[01:59:33] <stiffler> so basically the best solution is to make separate collections
[01:59:48] <GothAlice> Performing a second query to "pseudo-join" data between collections is not unusual. (In most cases I measured, it's still faster even with the second round-trip vs. MySQL.)
[01:59:49] <stiffler> and manage them in a similar way to SQL databases
[02:00:08] <GothAlice> You can nest. Just be sane about it. ;)
[02:00:40] <stiffler> but if I don't nest, will I still be able to use all the features of mongodb?
[02:00:46] <GothAlice> Keep it to one list (you wish to query) per document, and don't nest more than one level deep to keep the widest number of options available in terms of querying and updating that data.
[02:00:52] <GothAlice> Yes, though you lose some storage efficiency.
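What the "pseudo-join" could look like once timetables live in their own collection; the collection name and back-reference field below are assumptions about the refactored schema:

    var stop = db.stops.findOne({ lineNr: "817" });
    var times = db.timetables.find({ stop_id: stop._id, type: "12", hour: "6" }).toArray();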
[02:02:47] <stiffler> but tomorrow, I mean today morning (or afternoon :))
[02:02:54] <GothAlice> For a comparison of different storage methods (levels of nesting and typical queries for that type of data) you can see: http://www.devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework — storage can range from ~50MB to more than 600MB.
[02:03:18] <stiffler> ok, another of those links goes into my bookmarks
[03:45:57] <GothAlice> We had mixed scoring; energy which continually goes up but gets spent, with a maximum, refractory time, and rate of increase over that time. There was also per-game atomic increment scoring based on certain events or chains of events occurring.
[03:47:19] <GothAlice> For the first case all we need to know is the last time energy was spent, and how much energy there was at that time. Simply $set it whenever it gets spent. :)
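A minimal sketch of that lazy-evaluation idea; the collection, field names, and constants here are assumptions, not the actual game code:

    // Spend energy: regenerate lazily from the last recorded point, then $set the new level and timestamp.
    var doc = db.players.findOne({ _id: playerId }, { energy: 1, energyAt: 1 });
    var elapsed = (Date.now() - doc.energyAt.getTime()) / 1000;
    var current = Math.min(MAX_ENERGY, doc.energy + elapsed * REGEN_PER_SEC);   // regenerate lazily
    db.players.update(
        { _id: playerId },
        { $set: { energy: current - cost, energyAt: new Date() } }
    );
    // Reads do the same Math.min() computation, so no background job ever has to "tick" energy up.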
[03:48:17] <keeger> i just hate dealing with server setup stuff
[03:49:58] <keeger> then manage locking and transactions
[03:52:04] <GothAlice> For high availability you'll need replicas. This also covers data loss if appropriately spread amongst racks or DCs. Sharding is primarily useful when your "hot dataset" (the records you frequently query) grow beyond the size of available RAM, as this technique allows you to spread the physical records around. However in this situation, you would want to shard replica sets to ensure you maintain multiple copies of your data at all times.
[03:52:04] <GothAlice> (Only sharding reduces you back to single points of failure for your data.)
[03:53:07] <keeger> i was hoping to start off with 2 servers, and then scale wider if demand grew
[03:53:28] <keeger> cheaper, but a replica + shard is at least 4 servers probably
[03:54:38] <GothAlice> That would give you data protection, but not high-availability. Arbiters sorta add availability back, but at the cost that when running degraded (to use a RAID term) your data is at risk.
[03:56:54] <keeger> so for shard + replica, i need a replica, which is at least 2 boxes, and the shard, which is also at least 2, correct? probably need one more for the replica for safety?
[03:58:26] <keeger> and the app would basically hit the shard servers, for doing read/write, and the replica is just for data backup
[05:21:45] <Freman> so... turns out my problem is... mongo can't keep up
[05:36:33] <dcuadrado> so you can kill the consumers to reduce the load of the server
[05:40:09] <dcuadrado> Freman: please tell us a little more about your infrastructure
[05:42:13] <Freman> so, there's one mongo server that acts as the log server for the entire operation, there's 90 odd servers with all sorts of logs pointed at it (from php application logs to syslog)
[05:43:25] <Freman> my app is a nodejs server that connects to 6 other nodes, these 6 nodes run the tasks as directed by the server, the output of those tasks is sent back to my server, which then forwards it to mongo
[05:43:39] <Freman> which is taking its sweet time to write it
[05:44:44] <Freman> http://pastebin.com/9636rBbz - a chunk from mongostat
[05:46:15] <GothAlice> Freman: Ensure your "logging" collections are "capped" collections.
[05:47:00] <Freman> capped? mine has a background task that prunes the entries to 100 executions per task, there's about 160 tasks
[06:01:22] <Freman> http://pastebin.com/yhGVNhKJ <- most of my code
[06:01:24] <GothAlice> Freman: In the link I provided, it mentions several ways of improving write performance, and the trade-offs in the different approaches.
[06:02:16] <Freman> I'm responsible for a grand total of 9032 documents :D
[06:02:55] <Freman> http://pastebin.com/LsNx1URj my average document
[06:03:33] <GothAlice> And what you are currently doing is kinda not cricket. Having a capped collection of a reasonable size (estimated out for X time or X records, or both) and a process "tailing" it to catch interesting messages, which can be saved in a real collection elsewhere, is better. Adding TTL indexes onto that to enforce an X-time clearing of that collection would be for bonus points. (No need for you to do what the DB can do for you!)
[06:04:30] <Freman> problem is GothAlice we have things that run every second... and things that run once a month
[06:04:56] <GothAlice> Freman: I've benchmarked running 1.9 million things per second.
[06:05:11] <Freman> honestly, I don't think the db is suffering from my usage... 9032 documents...
[06:05:59] <Freman> I'm saying that my additional load is evidently the proverbial straw
[06:06:40] <GothAlice> Capped collections are more performant for a number of reasons, and the article I linked describes sharding plans that increase performance if you do not wish to use capped collections. (They're a feature added for this purpose, though. See: http://docs.mongodb.org/manual/core/capped-collections/)
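A sketch of the setup being suggested; the collection names, sizes, and the 30-day TTL are illustrative only:

    // Firehose: a capped collection that discards the oldest entries automatically.
    db.createCollection("log_firehose", { capped: true, size: 512 * 1024 * 1024, max: 1000000 });
    // Interesting messages get copied into a regular collection and expire by time via a TTL index.
    db.log_keep.createIndex({ createdAt: 1 }, { expireAfterSeconds: 60 * 60 * 24 * 30 });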
[06:06:51] <fhainb> perhaps you want to use a real database instead of a toy?
[06:07:47] <dcuadrado> Freman: can you remind me why do you say mongo can't keep up?
[06:09:32] <dcuadrado> (I'm thinking the bottleneck is in the client)
[06:09:40] <Freman> because I have a task that is outputting logs in real time, once every second; I can see them refresh, but the callback from mongo is ages behind
[06:11:13] <Freman> client writes to my server, it displays them as it receives them, at the same time it fires off a mongo update, then the callback from that update fires when mongo returns
[06:11:22] <GothAlice> Freman: Your approach is backwards. Instead of logging everything somewhere slow, then deleting the things you don't want, typically by number of elements, in a rather slow way, use a type of collection that can naturally limit itself to a certain number of documents.
[06:11:55] <GothAlice> A type of collection that is also inherently faster for logging-type usage.
[06:12:23] <Freman> I'm not logging anything I don't want and as a measure of keeping my collection small I'm deleting older entries
[06:12:45] <Freman> for the most part my entries are "this process ran, it had no stderr"
[06:12:48] <GothAlice> Freman: Capped collections do that for you.
[06:13:44] <dcuadrado> Freman: wondering... would a memory only db work?
[06:13:46] <Freman> I still say it's not really my problem, but a problem with the other 3 trillion log entries
[06:16:04] <GothAlice> Freman: Read the links I have provided, then field your questions. I currently have 5,574,197 log entries in one application's capped collection-based log. That's about one day, so… just under 4K messages per minute or 64 per second.
[06:16:49] <Freman> I did, but I'm not the architect of this mess, I'm just trying to get a task done, one that I'm sick of coming back to after having wasted the better part of this week trying to solve the issues with talking to mongo
[06:17:24] <Freman> what's the fastest way to find out a collection's size (either count, or bytes)?
[06:18:17] <GothAlice> db.stats() for the general storage vs. objects, data vs. index stuff.
[06:18:40] <GothAlice> db.collection.stats() for the logical collection-specific stats
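In shell terms (the collection name is illustrative):

    db.stats();          // database-wide: objects, dataSize, storageSize, indexSize, ...
    db.tasklog.stats();  // per-collection: count, size, storageSize, totalIndexSize, ...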
[06:21:06] <dcuadrado> Freman: if you just want to get the job done without changing the other parts of the system, I would just put nsq.io in front and have consumers writing to the db; that way you don't lose data and everything is eventually synced
[06:22:43] <Freman> http://pastebin.com/7XVsS1ar are the collections in this database... my collection is the very last one
[06:26:28] <Freman> thanks dcuadrado, I'll look at that tonight when I get home
[06:27:03] <dcuadrado> it's definitely not your fault
[06:27:36] <Freman> as you can see, my collection is minuscule compared to the others on that poor machine - the only reason mine even cares is because I have a watchdog to restart stuck processes, a process is deemed stuck when it's last run hasn't been updated for a long time, that gets updated at the end of the logging process
[06:28:04] <dcuadrado> you are probably losing data right?
[06:28:17] <GothAlice> I'm leaning towards correctable architectural efficiency with two potential clear answers, one of which eliminates the need for a maintenance process. (The other can expire by time using a TTL index, BTW.)
[06:28:24] <Freman> I'm not, no, but the watchdog is going haywire
[06:29:09] <dcuadrado> how are you not losing data? where is it stored if the process crashes or is restarted?
[06:29:41] <GothAlice> dcuadrado: With default write concern, in a buffer in the primary's RAM.
[06:29:41] <Freman> the watchdog politely asks the worker that's running the task to not run it again (if it is still running it) and then starts it again on another node
[06:30:07] <Freman> I'd like to build it in such a way that processes that run once a month have 6 months worth of logs and things that run once a second have a weeks worth of logs or something like that
[06:30:48] <Freman> dcuadrado: the whole thing is really nice, I promise, my mongo might not be up to par I confess but I seriously put some thought behind this task management thing
[06:32:11] <dcuadrado> I don't think it's mongo's fault either (although it's probably not the best tool for that job either)
[06:32:45] <dcuadrado> Freman: can you take a look at the log of mongodb?
[06:33:00] <Freman> No I don't believe that either, I just think I've reached the limit of what this install can handle with the way it is being used
[06:33:12] <dcuadrado> is there anything suspicious there?
[06:33:45] <Freman> probably, I'm afraid I don't have time right now - my carpool is waiting
[06:33:56] <Freman> which means I have to come back and sort this out tomorrow
[06:34:19] <GothAlice> https://gist.github.com/amcgregor/4207375 < a distributed task worker system, including scheduled tasks, using MongoDB. This was the 1.9 million calls per second thing I mentioned. (That's end-to-end on a single host, sharding on the task collection would improve performance somewhat.)
[06:35:28] <Freman> we've got cron style and persistent style
[06:35:42] <Freman> I'd love to talk more (and solve this) but I really have to split
[06:35:57] <Freman> thanks for your patience I'm going to look at NSQ tonight when I get home
[06:36:45] <dcuadrado> mongodb is not very good for worker systems either
[06:37:12] <dcuadrado> Freman: i don't think nsq is gonna help
[06:38:13] <GothAlice> dcuadrado: Production systems would argue the mongodb point there.
[06:44:35] <dcuadrado> yeah I've used it in production as a worker system several times, and while it works it's not the best tool for the job
[06:45:17] <dcuadrado> I shouldn't have said it's not very good
[06:46:46] <GothAlice> 1.9 million RPC calls per second… on a single host before any attempt to scale via sharding, with just 2 task producers and 4 workers is nothing to sneeze at. I should re-benchmark on modern hardware (that stat is… 3.5 years old) and with different sharding strategies to compare. Hmm. :3
[07:50:15] <Freman> nsq probably won't (yay, home) but it'll shuffle the problem further away from me :)
[08:29:49] <imjacky> Hi, I am using mongodb to do LBS queries. I have hit a problem with pagination: when I use skip and limit after find, some records with the same distance show up on both the current page and the previous page, which is not what I want. Is there a way to solve this? thx
[08:40:54] <imjacky> uhm, could anyone help solve my problem?
[08:47:07] <hdon> hi all :) my mongo build is huge! it looks like every command built is statically linked with ~200MB of libs. how can i tell mongo to build dynamic binaries?
[08:48:09] <hdon> oh or... well it looks like the 'stripped' dir contains much smaller copies... maybe it's really all debug info
[09:14:45] <bo_ptz> Hi all, how can I get the last 10 documents from mongo?
[09:18:20] <mocx> i'm using mongodb 3.0 for debian which i installed using the package manager repos as described in the documentation
[09:18:44] <mocx> what setting in the mongod.conf file do i need for the new wiredTiger storage engine?
[11:20:09] <_QGuLL_> hi, i'm quite confused by dropDatabase(): it doesn't seem to clean up the files in the dbdir. i still have files related to my dropped db, and they're still held open by the daemon (unless i restart mongo). is that normal?
[11:20:33] <_QGuLL_> and if i recreate the dropped db, i have some auth problems
[13:45:13] <Cygn> Hey guys, i am using this statement to count elements of a subarray of my documents. The documents are selected by attributes of the subarray, afterwards the subarray is unwound and the elements are counted… but i want to count only the elements which fit the criteria i also use to select the documents… http://pastie.org/9998972 - how would you do that?
[14:56:15] <d0x> Can someone explain to me why a map initialized with arr[key]=value gets broken when sent through the scope of a map-reduce job? This reproduces the problem: http://pastebin.com/7yfU0bGL
[14:56:50] <pamp> i created this method, but the result of var parent is not what i expected
[15:59:34] <SpartanWarrior> hello guys, I installed a brand new replica set, does anyone know how I set the --replSet arg to mongod when using the upstart scripts @ ubuntu?
[16:03:38] <coalado> hi. Is there anything new with mongodb 3 and authentication?
[16:04:17] <coalado> I cannot auth for example with mongovue or the java drivers
[16:34:45] <Redcavalier> Hi, I got a quick question regarding mongodb clients and replicasets.
[16:37:36] <Redcavalier> Basically from what I tested, when you shut down a primary in a replicaset, the client automatically connects to the new primary. How is that done?
[16:37:51] <Redcavalier> Is the client constantly aware of all the servers in the cluster?
[16:45:48] <Redcavalier> replace cluster by replicaset, sorry
[16:46:00] <Redcavalier> also, where would that information be stored?
[16:50:03] <MacWinner> Hi, is there a quick way to see how often a specific index is being used? I noticed a bunch of indexes on one of my collections, and I feel like half of them probably arent being used
[16:51:26] <gregf_> https://gist.github.com/anonymous/68b85c24f9bd643f9c59 <== eric
[16:53:22] <quattr8> Redcavalier: as far as i know the client caches the replicaset members, but for the php driver for example it is recommended to provide a seed list with all members
[16:56:03] <Redcavalier> quattr8, I see, so if the primary fails, it will automatically go through its cache and try the next server. Out of curiosity, how often is this cache refreshed?
[17:00:43] <quattr8> Redcavalier: I think the php driver and most other drivers cache the information on the first connection
[17:02:23] <Redcavalier> quattr8, ok, so if we add a node to the mongodb replicaset, then it would require the client to be restarted to reinitialize its cache and add the new member?
[17:03:59] <quattr8> Redcavalier: I think i read somewhere once that the information is cached for 5 minutes but can’t find anything on it anymore.
[17:04:56] <quattr8> Redcavalier: Probably best to connect using all replicaset members so you won’t have to wait for cache to change or reinitialize
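A sketch of the seed-list idea quattr8 describes; the hosts and replica-set name are made up. Listing every member lets the driver discover the topology even when the first host listed is down:

    var uri = "mongodb://db1.example.com:27017,db2.example.com:27017,db3.example.com:27017/mydb?replicaSet=rs0";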
[17:12:35] <Redcavalier> quattr8, thanks, that helps to answer our questions
[17:41:16] <pamp> is it possible to make a projection inside a find().forEach?
[17:41:55] <pamp> something like: make a query (find), do some operations on each document in the result set, and then make a projection
[18:34:43] <mocx> i'm using mongodb 3.0 that i installed from the debian package managers specified in the documentation, this install comes with a /etc/mongod.conf file that is an ini file
[18:34:58] <mocx> what is the ini option to change the storage engine to wiredTiger?
[18:36:52] <MacWinner> if 2 indexes have the same beginning part, is it true that you can delete the shorter index?
[18:37:21] <MacWinner> eg: note.hash_1_verb_1_note.slidecount_1 and note.hash_1_verb_1_note.slidecount_1_note.ext_1
[18:42:40] <pamp> I've a collection like this : { _id:1234321 , P: [ { k : "a" , v : 123 } , { k : "b", v:321 } , { k:"c" , v:345}, { ... } , { ... } ] }, How can I get the field "v" querying for the field "k" ? I want know what is the value "v" for the k:"a". How can I do that?
[18:44:02] <fhainb> query for k:a and pick up the value for v from the result
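A minimal sketch of that against pamp's document shape; the collection name is assumed:

    var doc = db.things.findOne({ "P.k": "a" }, { "P.$": 1 });   // project only the matched array element
    var value = doc ? doc.P[0].v : null;                         // 123 for the sample document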
[18:45:21] <StephenLynx> you want to output a value of a field as the value of another field?
[19:12:08] <pbbunny0801> but it's really strange, how can i find where mongod is getting the dbpath info from, outside of the mongodb.conf
[19:12:31] <pbbunny0801> it's getting the dbpath=/data/db/ somewhere, and it's what's causing the first issue
[19:12:34] <GothAlice> pbbunny0801: The command-line. Anything in a conf file can be on the command-line.
[19:12:58] <GothAlice> Also, AFAIK, that's the default path. http://docs.mongodb.org/manual/reference/program/mongod/#cmdoption--dbpath
[19:13:25] <GothAlice> pbbunny0801: So basically you currently _aren't_ specifying a dbpath, and you probably should. ;)
[19:13:27] <pbbunny0801> GothAlice, strange, because then the mongodb.conf file has /var/lib/mongodb/ as the default path
[19:13:50] <GothAlice> pbbunny0801: Ah, but MongoDB doesn't use a config file unless one is specified on the command-line. http://docs.mongodb.org/manual/reference/program/mongod/#cmdoption--config
[19:14:04] <GothAlice> (Note the lack of a default value for that one.)
[19:15:03] <pbbunny0801> so what's usually the way to get mongod to run during startup, using the actual command and not services?
[19:15:06] <girb> please help …. I already have 1 replica set named rs0 { 1 primary with a secondary }
[19:15:06] <girb> now I want to create a shard cluster of another 3 replica sets { 3 primaries with 3 secondaries }
[19:15:06] <girb> so my confusion is: in my new 3 replica sets should I give the same replica set name rs0, or should I give rs1, rs2 and rs3 to each?
[19:15:48] <GothAlice> pbbunny0801: Depends on platform. Usually some combination of mongod -f /etc/mongod.conf and a daemon management tool like start-stop-daemon, launchd, etc., etc.
[19:17:02] <GothAlice> girb: We use r01.s01.db.example.com server naming for our sharded replica sets. (Since replicas are contained within shards, replica number for the set is the first domain element. Second element counts the shards.)
[17:17:50] <GothAlice> Thus your first cluster would have r01.s00.db.example.com and r02.s00.db.example.com, and your second cluster would be r01 and r02.s01.db.example.com through s03.db.example.com.
[19:18:24] <mocx> GothAlice: is there an INI file config option for the storage engine?
[19:19:13] <GothAlice> mocx: Yes, but you can't really change that value with any existing data on that node. http://docs.mongodb.org/manual/reference/configuration-options/#storage.engine
[19:22:29] <mocx> i'm simply trying to migrate to yaml first
[19:22:41] <mocx> and be able to use sudo service to stop/start/restart
[19:22:43] <GothAlice> mocx: Pastebin/gist your console output when running mongod and contents of the mongod log, if available.
[19:23:12] <GothAlice> (I.e. try to manually start the service, remember -f /etc/mongod.conf option, and capture the output. Once it runs manually, we can worry about the platform automation.)
[19:24:53] <pbbunny0801> how do you fork that to the background though
[19:25:51] <GothAlice> For testing, one generally doesn't. (Forking makes tracking down output harder.) But in production, either the daemon management tool (start-stop-daemon) worries about forking, or http://docs.mongodb.org/manual/reference/program/mongos/#cmdoption--fork (or the config option version of that) are specified.
[19:43:49] <pbbunny0801> so I guess I can't use the service command for mongodb right now
[19:45:09] <GothAlice> pbbunny0801: Certainly should be able to.
[19:45:39] <pbbunny0801> i mean i can do service mongod start, but (might be a noob question) how can i have it take settings/values?
[19:46:20] <GothAlice> pbbunny0801: Same as above: examine the init.d script that "service" will be calling, identify the configuration file it uses (/etc/mongod.conf, /etc/conf.d/mongod, etc.) and go from there.
[19:49:03] <mocx> GothAlice should the log file say that it's started properly?
[19:50:09] <GothAlice> mocx: Generally yes. I haven't adjusted the verbosity and see something similar to the following after successful startup: [initandlisten] waiting for connections on port 27017
[20:01:37] <fabiobatalha> it is a basic mongod.conf file.
[20:01:39] <GothAlice> fabiobatalha: The very first line of that indicates to me this is not the actual contents of the file, or if it is, it's missing its top half.
[20:02:07] <GothAlice> Additionally, disabling the journal is strongly discouraged.
[20:02:20] <mocx> thanks for your help GothAlice everything seems to be running smoothly
[20:03:05] <fabiobatalha> it is like: http://docs.mongodb.org/manual/reference/configuration-options/#config-file-format
[20:03:55] <GothAlice> fabiobatalha: The config looks fine except for the extraneous processManagement section (since you're not forking, that PID file path is ignored) — what's the output you get when attempting to start the service?
[20:04:06] <GothAlice> (Both console output when starting, and the contents of the mongodb log file.)
[20:04:40] <fabiobatalha> Starting mongod (via systemctl): Job for mongod.service failed. See 'systemctl status mongod.service' and 'journalctl -xn' for details.
[20:05:24] <GothAlice> fabiobatalha: Cool. Let's simplify for a moment. Run "ps aux | grep mongod" to see if it's already running.
[20:14:44] <GothAlice> Great that you've found it. :)
[20:15:35] <GothAlice> fabiobatalha: When diagnosing issues hidden by many levels of abstraction and automation, the trick is to always simplify down to the minimum complexity needed. Then, and only really then, do problems become obvious.
[21:03:47] <hackel> Is anyone aware of an embedded MongoDB implementation for PHP integration testing? Looking for something like this Java solution: de.flapdoodle.embed.mongo
[21:04:01] <djam90> So I've just read the mongodb.org page, and I am wondering if my organisation would benefit from using some form of NoSQL
[21:06:36] <djam90> My organisation sells cars and vans (used and new). We store vehicles, and their associated data (linking to lots of tables for things like specification etc)
[21:18:50] <girb> please look at the end … even though I enabled sharding with sh.enableSharding("test1.test_collection") it still shows "sharded" : false
[21:21:56] <mordonez> this gives me "Can't canonicalize query: BadValue unknown top level operator: $oid"
[21:22:33] <Boomtime> girb: 'enableSharding' only enables the ability to shard collections in the named database, now you need to shard the collection
[21:22:38] <GothAlice> mordonez: Two important points: first, ObjectIds are not strings. Best case you're doubling the storage space, worst case you're mixing the two types and all sorts of badness will ensue.
[21:30:26] <mordonez> but it gives me "Can't canonicalize query: BadValue cannot nest $ under $in"
[21:32:31] <Boomtime> mordonez: congratulations, you have found a bug in the shell
[21:32:41] <Boomtime> you should use ObjectId instead
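In the shell that means constructing ObjectIds directly rather than passing the extended-JSON form; the collection name and hex value here are purely illustrative:

    db.orders.find({ _id: { $in: [ ObjectId("507f1f77bcf86cd799439011") ] } });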
[21:33:19] <Boomtime> also, if you like, you should raise a server ticket quoting the query you tried and a pointer to the docs indicating it should permit the extended json
[21:33:34] <GothAlice> Boomtime: I never expected the JSON-encoded form to be directly accepted. :/
[21:34:09] <Boomtime> docs says it is, so there is a bug somewhere, if it's not in the shell, then it's in the docs
[21:34:33] <Boomtime> i will make a note to check up in a couple hours, if no tickets have been raised along these lines i will raise one
[21:37:01] <Boomtime> mordonez: actually, the docs do cover this.. sort of: http://docs.mongodb.org/manual/reference/mongodb-extended-json/#input-in-strict-mode
[21:37:21] <girb> I did an ID-based hashing on test1.test_collection .. do chunks exist on shard0002 and shard000?
[21:37:41] <Boomtime> shell parses it but does not translate the type, so the server gets to see it, and handles it like an operator.. dies (correctly)
[21:40:36] <Boomtime> girb: apparently your collection is not large enough to split, in the shell, please run: db.getSiblingDB("test1").test_collection.stats()
[21:40:53] <Boomtime> girb: ok, you got it sorted?
[21:41:13] <girb> Boomtime: I got it through db.test_collection.getShardDistribution() .. it's only 30MB
[21:45:20] <GothAlice> mordonez: json.loads(s, object_hook=bson.json_util.object_hook) — in Python, using PyMongo, one must decode the JSON first. There should be comparable approaches using other drivers.
[21:45:46] <mordonez> yes, I finally get it, thanks GothAlice
[22:41:56] <erewh0n> I'm trying to set up a 3-node repl cluster (pri, sec, arb) and the sec node reports "replSet error loading set config (BADCONFIG)"
[22:50:11] <Boomtime> erewh0n: can you pastebin/gist your replica-set config
[22:58:10] <erewh0n> I confirmed that all 3 servers can reach each other. The secondary (the one with the issue) is showing a ton of connect/disconnect activity in the log from the primary (which might just be the primary doing a check a few times every second to see if the secondary is ready).
[23:00:03] <erewh0n> the rs.status() from the primary reports "still initializing" for the secondary.