#mongodb logs for Tuesday the 5th of June, 2012

[00:07:38] <radicality> Hi. Does anyone here have any experience/tutorials using amazon's elastic map/reduce for mongodb ? I can't find even the simplest example :S
[00:22:25] <acidjazz> [Tue Jun 05 00:20:48 2012] [notice] child pid 23477 exit signal Segmentation fault (11)
[00:22:26] <acidjazz> still
[00:22:32] <acidjazz> ive tried 2.0.5 2.0.6 2.1.1
[00:26:53] <acidjazz> trying php drivers 1.3.0dev 1.2.10 etc etc
[00:26:57] <acidjazz> ive tried all combinations
[00:27:00] <acidjazz> still seg faults
[00:29:45] <acidjazz> it was my code
[00:29:55] <acidjazz> i was passing GridFS a mongo obj and not a mongoDB obj
[00:30:16] <acidjazz> i blame dstorrs
[00:33:09] <acidjazz> so now that ive pulled a file out of the gridfs how do i get the actual file
[00:37:38] <acidjazz> ah getBytes()
[00:38:38] <acidjazz> yaay more segfaults
[00:38:41] <acidjazz> w/ getBytes
[00:39:24] <jcrew> please tell me more
[00:45:57] <acidjazz> heres my exact issue http://pastebin.com/cw6T3epj
[00:46:55] <acidjazz> Tue Jun 5 00:45:24 [conn2] bravo.fs.chunks Assertion failure false db/key.cpp 409
[00:49:04] <acidjazz> i figured it out
[00:49:09] <acidjazz> i had write issues in /var/lib/mongodb
[00:49:14] <acidjazz> which broke the db
[00:49:21] <acidjazz> so i fixed the write issues then did a repairDatabase();
[00:49:32] <acidjazz> i blame dstorrs again
[00:51:16] <dstorrs> acidjazz: what did I do this time?
[01:00:29] <acidjazz> dstorrs: yall segfaultins my apaches
[01:01:16] <dstorrs> <voice = "acidjazz">u r in mah mongo segfaultin' mah apaches</voice>
[01:01:23] <dstorrs> FTFY ===^
[01:02:09] <dstorrs> glad you fixed it
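For reference, the recovery acidjazz describes (fix permissions on the data directory, then repair) could look roughly like this in the mongo shell; the database name comes from the assertion above, everything else is a sketch rather than the exact commands used:

    // after fixing ownership/permissions on /var/lib/mongodb, e.g.
    //   chown -R mongodb:mongodb /var/lib/mongodb
    // reconnect with the mongo shell and rebuild the files of the affected database
    use bravo                   // the db that held the failing fs.chunks collection
    db.repairDatabase()         // rewrites/compacts the data files; needs free disk space
    db.fs.chunks.validate()     // optional sanity check afterwards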
[01:31:32] <sebastiandeutsch> Hi I'm running a mongodb with the following stats: mapped 20.1g vsize 40.9g index 2.4g - queries are getting slow lately. I have server timeouts etc. Are there any parameters I can tweak beside the default config?
[01:33:07] <sebastiandeutsch> the number of faults is not very high, but the locks look quite high
[01:33:53] <dstorrs> sebastiandeutsch: are you doing a lot of map reduce?
[01:34:21] <sebastiandeutsch> dstorrs: no but we need to identify docs by string (and we have 8 million of them)
[01:35:00] <dstorrs> and you're sure it's a DB issue and not a memory or disk I/O issue ?
[01:35:12] <dstorrs> I know you said locks were high, just wanted to make the suggestion
[01:36:20] <sebastiandeutsch> dstorrs: currently mongo uses 31% of ram
[01:37:25] <sebastiandeutsch> dstorrs: and any suggestions welcome ;-)
[01:38:34] <dstorrs> hang on, vsize is 40+G?
[01:38:41] <dstorrs> so you're 20G into swap?
[01:39:33] <jamescarr> j #mongoosejs
[01:39:53] <dstorrs> jamescarr: forgot the / :>
[01:40:13] <jamescarr> I know :)
[01:40:28] <dstorrs> oh. heh.
[01:43:06] <sebastiandeutsch> dstorrs: Yes… that is probably the culprit ;-)
[01:44:30] <sebastiandeutsch> dstorrs: Would sharding of the collection help to reduce the size?
[01:51:14] <dstorrs> sebastiandeutsch: yes
[01:53:17] <dstorrs> sebastiandeutsch: in that it would move some of the data onto other machines
[01:53:29] <dstorrs> sharding on the same box wouldn't help, obviously
[01:54:30] <sebastiandeutsch> dstorrs: understood. thx.
[04:34:43] <OsamaBinLaden> whats up with so many posts on HN, where ppl are moving away from mongo ? :P
[04:37:25] <wereHamster> link?
[04:38:42] <OsamaBinLaden> http://www.zopyx.com/blog/goodbye-mongodb
[04:38:46] <freezey> recommended mount settings for physical hardware? ssd
[04:39:03] <OsamaBinLaden> http://news.ycombinator.com/item?id=3982142 - and there are lots more
[04:48:29] <OsamaBinLaden> anyway, i'll stick to mongo :P
[04:53:32] <dstorrs> OsamaBinLaden is right. I've been seeing more hater posts around lately
[06:01:57] <hdm> quick question, trying to add a generic MR script for counting unique fields, this seems to blow up: emit(eval("this." + fname), { c : 1 }); in the map routine, fname is defined further up in the same script
[06:02:11] <hdm> do i need to use a function factory instead?
[06:02:50] <hdm> can a MR map function access vars in the scope of the caller basically
[06:03:01] <hdm> looks like no, but figured i should check
[06:03:08] <dstorrs> hdm: I think you want the 'scope' variable
[06:03:33] <dstorrs> http://www.mongodb.org/display/DOCS/MapReduce
[06:03:40] <dstorrs> grep for 'scope'
[06:03:52] <dstorrs> if I understand what you want, that should do it
[06:04:50] <hdm> awesome, thanks!
[06:04:57] <dstorrs> np
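A rough sketch of the 'scope' option dstorrs points to, assuming a hypothetical collection and field name; scope injects shell-side variables into the map function's JavaScript context, so the eval("this." + fname) trick isn't needed:

    var fname = "country";                               // hypothetical field to count uniques of
    db.events.mapReduce(
        function () { emit(this[fname], { c: 1 }); },    // map: this[fname] instead of eval()
        function (key, vals) {                           // reduce: sum the counts per key
            var total = 0;
            vals.forEach(function (v) { total += v.c; });
            return { c: total };
        },
        { out: { inline: 1 }, scope: { fname: fname } }  // 'scope' makes fname visible server-side
    );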
[06:37:31] <henrykim> could we change a bucket size of b-tree?
[06:37:45] <henrykim> I believe it's default value is 4k.
[06:37:58] <dstorrs> henrykim: could you provide some context on that?
[06:40:12] <henrykim> dstorrs: hi, my question is about the default size of a b-tree node.
[06:40:38] <philnate> why you need to have a bigger bucket size?
[06:41:14] <dstorrs> henrykim: in 1.9.1 it was 8192 https://jira.mongodb.org/browse/SERVER-993
[06:41:24] <dstorrs> if anything, it would have gone up since then
[06:41:30] <henrykim> philnate: I don't need a bigger bucket size. I am just wondering about it.
[06:41:46] <henrykim> dstorrs: thanks.
[06:41:51] <philnate> https://jira.mongodb.org/browse/SERVER-993
[06:42:01] <philnate> nah doubt that you can change the bucket size
[06:42:14] <henrykim> The situation I met is I set url as a shard-key. the average of size of it is about 70 bytes.
[06:42:23] <philnate> ah sorry, didn't look at previous lines
[06:42:36] <philnate> but you can have long ones...
[06:42:57] <philnate> but URLs longer than 8k?
[06:43:10] <henrykim> but, the performance of mongo shards was getting slower and slower after 1 or 2 hours
[06:43:34] <philnate> because of what, Bucket size?
[06:43:35] <henrykim> philnate: nope. less than 100 bytes.
[06:43:59] <philnate> so where does this performance degradation come from?
[06:44:36] <henrykim> I guess the size of the id is quite long. this causes the current situation
[06:44:39] <dstorrs> henrykim: FYI, I wouldn't use anything <v2.0 http://blog.mongodb.org/post/10126837729/mongodb-2-0-released
[06:44:52] <dstorrs> Most of those features I would not consider to be optional
[06:45:06] <dstorrs> YMMV, though
[06:45:48] <philnate> hmm, do you use custom ids or default ones?
[06:46:11] <henrykim> I am using my custom id which is URL value itself.
[06:46:23] <henrykim> _id == url
[06:46:26] <philnate> ok but then it will be only about 70byte
[06:46:36] <dstorrs> henrykim: _id == md5(url)
[06:46:37] <henrykim> yeap!!
[06:46:39] <dstorrs> ?
[06:46:44] <philnate> don't think that this is causing huge problems
[06:46:53] <henrykim> _id == url itself. I didn't change it .
[06:47:06] <dstorrs> no, that was a suggestion
[06:47:17] <henrykim> dstorrs: thanks ;)
[06:47:20] <dstorrs> that way you don't need to worry about this
[06:47:31] <philnate> with hash as id you may come to the point of collisions
[06:47:48] <philnate> although this is quite theoretical, you may encounter it
[06:48:09] <philnate> so you need to consider appropriate conflict resolutions
[06:48:10] <dstorrs> bloody unlikely
[06:48:30] <philnate> dstorrs: yes, but it may hit you
[06:48:31] <dstorrs> and if you're really worried about it, append an epoch
[06:50:45] <philnate> dstorrs: where is this epoch added? to the hash or url?
[06:51:16] <dstorrs> _id = md5(url) . epoch()
[06:52:15] <philnate> and you would do primary key lookups through regex?
[07:08:43] <henrykim> dstorrs: Is the md5 algorithm sure to generate unique keys perfectly?
[07:09:44] <henrykim> dstorrs: for example, I have 100 billion urls now. If I changed them to md5 values, would they all be different?
[07:11:04] <henrykim> dstorrs: md5 has a 128 bit array to keep a hash value. I am sure that's enough size to keep my data, but as for the algorithm itself, I have no idea.
[07:15:30] <philnate> http://en.wikipedia.org/wiki/MD5
[07:16:26] <philnate> as it's only a hash algorithm there may be collisions, so you may end up having two or more URLs with the same hash, that can happen with every hash algorithm, so you have to live with that.
[07:16:57] <philnate> Although as we mentioned it's quite unlikely to happen you may encounter that two URLs hash to the same value
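What the md5-based key dstorrs suggests could look like in the shell (hex_md5() is a shell built-in; the collection and URL are invented). The epoch-suffixed variant sidesteps collisions but gives up exact lookups by re-hashing the URL, which is what philnate's regex question is about:

    var url = "http://blog.daum.net/someuser/12345";
    var id  = hex_md5(url);                      // 32 hex chars, fixed length, vs ~70-byte raw URLs
    // collision-averse variant: append an epoch (but then you can't look up by hash alone)
    // var id = hex_md5(url) + "-" + Math.floor(new Date().getTime() / 1000);
    db.pages.insert({ _id: id, docUrl: url });
    db.pages.findOne({ _id: hex_md5(url) });     // exact _id lookup, no regex needed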
[07:21:23] <philnate> henrykim: whats your system setup and your query paths?
[07:21:54] <philnate> when the performance degradation started what was happening in your system?
[07:23:42] <henrykim> philnate: I setup 3 shards, 5 routers.
[07:24:12] <philnate> ok how is the memory/drive utilisation?
[07:24:26] <philnate> have you looked into mongostat to see some stats about your system
[07:24:31] <henrykim> each server has 8G memory. and it's disk capability is over 600G.
[07:24:44] <henrykim> yes.
[07:25:01] <philnate> ok
[07:25:03] <henrykim> I monitored the performance decrease from mongostat.
[07:25:10] <philnate> so when did it start?
[07:25:21] <henrykim> from over 2500 TPS to 200 TPS.
[07:25:27] <philnate> you just filled your db and then when the memory was filled it started to decrease?
[07:25:45] <henrykim> I tested it from 8PM to 8AM (almost 12 hours).
[07:25:51] <philnate> whats your idx miss ratio?
[07:26:07] <henrykim> How could I monitor it?
[07:26:18] <philnate> so when you started testing your system had already 100billion entries?
[07:26:29] <philnate> mongostat has a field called idx miss % or so
[07:26:50] <henrykim> yes miss % is 0
[07:26:52] <philnate> I guess you need to execute that on your actual mongod system
[07:27:02] <philnate> can you paste some lines of your mongostat
[07:27:14] <henrykim> search-ddm-test1.daum.net:27018 0 0 0 0 0 3 0 38.1g 77.5g 4.71g 0 0 0 0|0 2|0 186b 4k 474 rs_1 M 20:13:55
[07:27:25] <philnate> and maybe some example document you're storing (pastebin)
[07:27:33] <henrykim> yes this line is first one of mongostat.
[07:27:42] <henrykim> ok.
[07:28:20] <philnate> could you pastebin a few more lines together with a header line? and please tell what your queries are
[07:29:32] <henrykim> each document is json-styled.
[07:29:44] <henrykim> the average size of it is about 6k.
[07:30:58] <henrykim> please, reference this one http://pastie.org/4030179
[07:31:42] <henrykim> docUrl is _id.
[07:32:11] <philnate> this is mapped through some driver/mapper?
[07:32:33] <philnate> are those urls always starting with http://blog.daum.net/?
[07:32:48] <henrykim> philnate: sorry, total count is not 100 billion. 0.6 billion is current one.
[07:33:01] <henrykim> prefix url patterns are quite dynamic.
[07:33:16] <henrykim> some one is like that.
[07:33:34] <henrykim> another one is http://{username}.blog.me
[07:33:58] <henrykim> another one is http://{username}.tistory.com/{articlenumber}
[07:34:07] <philnate> ok
[07:34:43] <philnate> so some minor improvement could be to remove http:// from your id, as this isn't giving you anything
[07:35:12] <philnate> that would save you 7 bytes, in total that would be 14 for bson and index
[07:35:31] <philnate> and what queries do you perform on those?
[07:35:43] <philnate> those documents
[07:37:16] <henrykim> find any documents from _id(url) <--- this is all we need currently.
[07:37:46] <[AD]Turbo> hola
[07:38:03] <philnate> so simple _id = URL and no in (URLs) or prefixed lookups?
[07:39:01] <philnate> so I may have missed it, but can you say what happened when the performance degradation started. When/What did you monitor where you encountered this problem?
[07:39:29] <philnate> Did you just monitor the normal system doing daily business, or was it importing those data, or whatever
[07:41:04] <philnate> btw from what I saw from your mongostat, it looks like only two queries were sent to the server
[07:42:45] <henrykim> from the mongostat updates/queries data, I drew the performance graph. I can see the total performance drop to about half of the max performance after 1 hour.
[07:43:34] <henrykim> this system is storing all documents from our services. we need to store it permanently.
[07:43:50] <henrykim> but, we can restore it if we lose it for any reason.
[07:43:52] <philnate> maybe, but you have to consider the other values as well, if no queries enter your system you may see fewer queries executed
[07:44:41] <henrykim> yes sure, currently I am testing the performance of mongodb. in a real service, we need to ensure several more indexes.
[07:45:08] <philnate> I would like to help, but I'm missing some performance output
[07:45:57] <philnate> so how did you generate the load for your system?
[07:46:04] <henrykim> yes.
[07:47:07] <henrykim> roughly, load averages are about 2~4 on each mongod server.
[07:47:34] <henrykim> cpu usage is about 20~30%
[07:47:52] <philnate> please pastebin some mongostat outputs
[07:47:54] <henrykim> almost all cpu usage is from iowait.
[07:48:13] <philnate> so I guess it's random access to your documents?
[07:48:18] <henrykim> yes.
[07:48:29] <philnate> is this realistic?
[07:48:31] <henrykim> random INSERT.
[07:48:35] <henrykim> yeap.
[07:48:47] <henrykim> we cannot guess users' updates in the future.
[07:48:52] <philnate> random insert and read?
[07:48:56] <henrykim> yeap
[07:49:10] <philnate> so whats your load test, only inserts or reads as well?
[07:49:24] <henrykim> I am testing all of it.
[07:49:27] <philnate> I don't think that reads are fully random as well
[07:49:37] <henrykim> but, the scenario we are talking about is only INSERT.
[07:49:39] <henrykim> randomly insert.
[07:49:57] <philnate> ok
[07:50:05] <henrykim> mongostat is quite long. (43000 lines).
[07:50:24] <henrykim> Could I paste it on paste.org?
[07:50:45] <philnate> can you show some lines around the position where performance got down, so something around 100-200 lines at most
[07:50:46] <henrykim> the file size is about 80M
[07:50:54] <henrykim> ok.
[07:51:53] <philnate> basically without looking at your stats, you'll have problems with random inserts as soon as you hit the memory limit. Performance will decrease to some extent.
[07:52:11] <philnate> For random inserts I guess this drop will be quite bigger than for sequential
[07:53:16] <henrykim> philnate: here is my log http://pastie.org/4030245
[07:53:46] <henrykim> I started it from 20:14:00
[07:54:07] <henrykim> the performance at this time is over 3600 TPS ( sum of all master nodes performance)
[07:54:25] <henrykim> but after 2 hour later it's performance is under 1500 TPS
[07:57:32] <henrykim> the log I pasted is middle period of decreasing.
[07:58:29] <henrykim> philnate: I will give some more messages about 1 hour later. thanks
[07:58:33] <philnate> hmm the paste shows me that this isn't an insert-only scenario, else there wouldn't be getmore entries, or am I wrong?
[08:00:07] <philnate> additionally you have somewhat many query commands as well, not sure if those are from sharding (and rebalancing)
[08:00:51] <philnate> ah ok, I see
[08:01:11] <philnate> those are different nodes... I ignored that column
[08:02:41] <philnate> so whats the test1? why is this server getting such a bunch of query and update statements? Is this a mongo config server?
[10:31:30] <ub|k> hello
[10:31:52] <ub|k> i was wondering what is the best way to express a many to many relationship in mongodb
[10:32:13] <NodeX> in what context
[10:32:23] <NodeX> and which gets queried more
[10:32:36] <ub|k> i was using a list of DBRefs in each document, but the fact that i.e. mongokit has problems with it makes me think that maybe i am doing something terribly wrong
[10:32:54] <ub|k> i have talks and speakers
[10:33:08] <ub|k> i want to know for each speaker all the talks he gave and for each talk who were the speakers
[10:33:19] <ub|k> i believe the the second gets queried the most
[10:33:43] <ron> ub|k: the best way is not to do it.
[10:33:44] <NodeX> Personaly I would store what you need to know quickly in an embedded array inside the talks
[10:34:07] <NodeX> then you can query all talks by X easily
[10:34:17] <NodeX> and you can list all speakers at each talk
[10:34:25] <ub|k> hm, is that easy to query?
[10:34:29] <ub|k> i mean, membership of an array
[10:34:37] <ub|k> can i index that in some way?
[10:34:39] <NodeX> "talks.user":"Bob"
[10:34:51] <NodeX> sorry ... like this ...
[10:35:25] <ub|k> but that's assuming a talk has only one speaker
[10:35:32] <ub|k> i want it to have several
[10:35:56] <NodeX> "talks.speaker":"bob" ... where your talks collection looks somehting like this ... {speakers : [{speaker:"bob",date:"blah"},{speaker:"john",date:"blah"}...]}
[10:36:32] <ub|k> ah i see
[10:36:35] <ub|k> thanks
[10:36:37] <ub|k> :)
[10:36:44] <NodeX> yw ;)
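A sketch of NodeX's embedded-array layout with the field names made consistent; all names here are illustrative:

    // one document per talk, speakers embedded as an array of sub-documents
    db.talks.insert({
        title: "Schema design in MongoDB",
        speakers: [ { name: "bob" }, { name: "john" } ]
    });
    db.talks.ensureIndex({ "speakers.name": 1 });                       // index into the embedded array
    db.talks.find({ "speakers.name": "bob" });                          // all talks bob spoke at
    db.talks.findOne({ title: "Schema design in MongoDB" }).speakers;   // speakers of one talk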
[11:27:04] <tonyk> do slow read queries (for example a sort on an unindexed column) block other read queries on that collection?
[11:27:33] <NodeX> only if your connection pool is exhausted
[11:27:50] <tonyk> alright
[11:49:07] <ub|k> NodeX: still about my earlier question... the problem with your approach is that i will end up having several copies of the same speaker lying around, if (s)he's got more than 1 talk
[11:49:23] <ub|k> which then may be undesirable in case i want to update speaker information etc
[11:49:25] <ub|k> :/
[11:52:28] <NodeX> err
[11:52:46] <NodeX> you store the uid and the name (because that doesn't change) and the rest in another collection
[11:58:41] <Saby> Hi
[12:01:08] <Saby> is it possible to map a formula to a field in my collection, so that the value of that field gets calculated based on the formula
[12:05:10] <kali> Saby: nope
[12:06:58] <Saby> oh ok kali, thanks
[12:07:29] <Saby> So, I will need to perform the calculation in my code. Is it possible in any way through a hack or something ?
[12:14:30] <tonyk> is count() with no arguments fast enough to be used in place of say storing count on a doc by itself?
[12:17:20] <kali> Saby: nothing i like to think about
[12:19:29] <ledy> hi
[12:20:45] <ledy> after playing with db.largecollection.find( { $or : [ { identifierX : "sha1_hash123" } , { identifierY : "md5_hash456" } , { identifierZ : "another_hash789" } ] } ) i checked the indexes that mongodb prepared automatically.
[12:21:53] <ledy> in addition to the "single" ensureIndex for the identifiers, there is a identifierX_1_identifierY_1_identifierZ_1 now and performance is down :-(
[12:22:40] <ledy> any ideas how to explain to mongodb that it could be enough to use the indexes for the single IDs because i do not have $and, but $or queries?
[12:41:09] <kali> kelye: afaik, you can force mongodb to use a given index, force it to perform a table scan, but not forbid the use of one given index
[12:41:24] <kali> ledy: ^ (sorry kelye)
[12:41:58] <kali> ledy: have you consider performing three queries instead of one and mixing the result in application land ?
[12:52:45] <ledy> kali: strange, after removing the triple-index, it has not been created again. now it is using the single indexes.
[12:53:32] <ledy> "MongoDB can only use one index at a time" => this leads to the question:
[12:53:45] <ledy> What query do you suggest me to use with the find? { $or : [ { identifierX : "sha1_hash123" } , { identifierY : "md5_hash456" } , { identifierZ : "another_hash789" } ] } OR better 3 * db.find for any single identifierX/Y/Z and merging the results on my own?
[12:55:17] <kali> ledy: i don't remember how $or are treated by the optimizer... but if you perform three simple query, at least, i'm sure you'll get the right index for each query :)
[12:56:02] <kali> ledy: http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-Oneindexperquery.
[12:56:26] <kali> ledy: ok, it looks like $or will run the three queries and merge the results
[12:57:49] <Derick> ledy: hi :-)
[12:57:56] <Derick> ledy: I just answered on Stack Overflow
[13:02:17] <Derick> ledy: and updated my answer there too
[13:07:47] <ledy> thx
[13:10:06] <ledy> i'd prefer to let mongodb do the job with one query including all three identifiers in the $or... i hope $or is the "lazy operator" for this statement. so mongodb can stop on its own after first match when i use findOne...
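Both options ledy is weighing, sketched against the collection above; explain() on the $or form shows which index each clause ends up using:

    db.largecollection.ensureIndex({ identifierX: 1 });
    db.largecollection.ensureIndex({ identifierY: 1 });
    db.largecollection.ensureIndex({ identifierZ: 1 });

    // one $or query: each clause can be served by its own single-field index
    db.largecollection.find({ $or: [
        { identifierX: "sha1_hash123" },
        { identifierY: "md5_hash456" },
        { identifierZ: "another_hash789" }
    ] }).explain();

    // alternative: three simple queries, stopping at the first match in application code
    var doc = db.largecollection.findOne({ identifierX: "sha1_hash123" })
           || db.largecollection.findOne({ identifierY: "md5_hash456" })
           || db.largecollection.findOne({ identifierZ: "another_hash789" });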
[13:40:21] <leecher> Hey guys, I'm trying to use find to locate all documents that do not have the filed "deleted_at" .. give me a tip on how to do that?
[13:40:56] <rexxars> leecher: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24exists
[13:41:11] <leecher> pff .. thx!
[13:41:22] <rexxars> yw :)
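The $exists form rexxars links to, sketched against a hypothetical collection:

    // all documents that do not have a deleted_at field at all
    db.items.find({ deleted_at: { $exists: false } });
    // and the complement, e.g. for listing soft-deleted documents
    db.items.find({ deleted_at: { $exists: true } });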
[13:48:10] <horseT> How can I get mongodb 2.2 unstable to run my own benchmarks?
[13:49:23] <Derick> you can get it from github; it's the master branch
[13:50:33] <kali> there is also a nightly build for 2.1
[13:54:08] <Derick> ledy: it's polite to accept answers on StackOverflow btw
[14:07:57] <horseT> Derick: thanks
[14:09:47] <ledy> @Derick, is it marked as answered now? new to stackoverflow...
[14:10:19] <Derick> nope, not yet
[14:10:34] <Derick> i think you just upvoted it
[14:10:56] <multiHYP> hi
[14:11:39] <Derick> ledy: http://meta.stackoverflow.com/questions/5234/how-does-accepting-an-answer-work
[14:20:55] <horseT> @horseT : plop
[15:29:28] <tonyk> is count() with no arguments O(1)?
[15:30:25] <kali> tonyk: yes. beats sql implementations, right ? :)
[15:31:10] <kali> tonyk: it may be O(n) where n is the shard number actually, i'm not 100% sure
[15:34:20] <tonyk> I'll just use a counter inside my global stats record
[15:34:25] <tonyk> schemaless is perfect for this
[15:36:23] <tonyk> kali: I dont know why count(*) is "slow" on SQL servers, should be an easy problem to solve, right?
[15:37:43] <kali> not really, actually, because of the transaction model
[15:38:54] <kali> tonyk: there are endless trolls in the postgresql archive, if you're interested
[15:47:23] <tonyk> my trolling appetite must be satisfied
[15:48:30] <tonyk> there should always be a fast count that isnt ACID compliant for pagination etc
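A sketch of the counter-in-a-stats-document approach tonyk settles on; collection and field names are invented:

    // bump the counter on every profile insert (upsert creates the stats doc the first time)
    db.counters.update({ _id: "globalStats" }, { $inc: { profileCount: 1 } }, true);
    // decrement on delete
    db.counters.update({ _id: "globalStats" }, { $inc: { profileCount: -1 } });
    // O(1) read for pagination, independent of collection size
    db.counters.findOne({ _id: "globalStats" }).profileCount;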
[16:36:45] <ranman> hey interns
[16:36:51] <hendzen> hey bros
[18:13:44] <zykes-> hi there
[18:13:54] <zykes-> does mongodb do like master-master geo replication ?
[18:15:06] <linsys> No
[18:15:19] <linsys> zykes-: But from what I understand 2.2 will have geo aware sharding
[18:20:07] <jjbohn> back
[18:20:34] <zykes-> linsys: so you can't replicate with it across continents ?
[18:20:59] <linsys> yes you can replicate your data across continents but you asked for master-master
[18:22:17] <zykes-> linsys: so today's replication means master > slave?
[18:22:40] <NodeX> you can tag your nodes
[18:22:48] <zykes-> meaning ?
[18:23:51] <NodeX> you can tell it where it is if that makes sense
[18:24:01] <NodeX> I am trying to find the docs on it, 1 sec
[18:24:08] <linsys> a replica set is a set of mongodb servers with one master or PRIMARY while the rest are secondaries or slaves
[18:24:20] <linsys> if the primary fails one of the secondaries takes over the function of the master
[18:24:38] <linsys> Right, but there is no master / master replication but there is sharding
[18:25:30] <NodeX> it kind of is master/master replication because they both have a full copy of the data
[18:25:31] <zykes-> sharding meaning ?
[18:25:33] <linsys> http://www.mongodb.org/display/DOCS/Data+Center+Awareness#DataCenterAwareness-Tagging%28version2.0%29
[18:25:37] <NodeX> if one fails the other one takes over
[18:25:48] <linsys> that is not master / master replication
[18:25:57] <linsys> but if thats what you call master / master then its master / master
[18:26:16] <linsys> sharding: http://www.mongodb.org/display/DOCS/Sharding+Introduction
[18:26:35] <NodeX> I didn't say it was
[18:26:41] <NodeX> I said it's kinda like it
[18:26:44] <linsys> ok
[18:26:57] <linsys> anyway you get the idea...
[18:27:20] <NodeX> http://www.mongodb.org/display/DOCS/2.0+Release+Notes#2.0ReleaseNotes-Datacenterawareness
[18:27:36] <NodeX> "You can now "tag" replica set members to indicate their location. You can use these tags to design custom write rules across data centers, racks, specific servers, or any other architecture choice."
[18:28:36] <linsys> Yes I know how mongodb works...
[18:29:01] <linsys> That is for devs to code that they want to make sure a write goes to two racks or 3 specific servers etc..
[18:29:17] <NodeX> DId I say that you didn't know how mongo works?
[18:29:45] <linsys> No, just stating that since u pasted a link. Anyway any other question
[18:30:11] <NodeX> it was not for you.. you obviously know how mongo works so it can't be for you ;)
[18:30:37] <NodeX> ignored
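For reference, the tagging described in those release notes might be set up roughly like this from the shell, assuming a three-member replica set; the host layout, tag values and mode name are made up:

    var cfg = rs.conf();                          // run on the current primary
    cfg.members[0].tags = { dc: "us-east" };
    cfg.members[1].tags = { dc: "us-east" };
    cfg.members[2].tags = { dc: "eu-west" };
    cfg.settings = cfg.settings || {};
    // custom write rule: acknowledge only once the write reached two distinct data centers
    cfg.settings.getLastErrorModes = { bothDCs: { dc: 2 } };
    rs.reconfig(cfg);
    // a client can then request that rule per write:
    db.runCommand({ getLastError: 1, w: "bothDCs" });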
[18:30:48] <souza> Hi all
[18:35:11] <souza> guys i have a non-technical question about what is more recommended using mongodb: i can use three schemas and make relations among them, or create only one schema and put arrays inside it. what's more recommended? Thanks
[18:36:15] <zykes-> when's 2.2 due ?
[18:36:53] <NodeX> souza : it depends on your data model
[18:37:53] <souza> NodeX: i don't understand, i think that's the same thing, but shown in different ways.
[18:38:15] <NodeX> if your data model prefers relational then use relational
[18:38:26] <NodeX> if you query one thing less than another then use embedded
[18:38:43] <linsys> souza: It depends very much so on your data model... If the single json document will only grow to a certain size you might want to embed. If you want to create a chat service and user Joe is going to have 100s or even 1000s of embedded conversations in a single json document its better to create a new collection called chat and reference Joe's conversations from that collection. Each conversation its own document
[18:41:54] <souza> linsys: in my case i'm using this only for tests. i want to log in to a given system, and all my actions generate a log, so i'll have a "user" object and another "log" object
[18:42:40] <souza> i think to this case, i can use only one schema, right?
[18:43:58] <linsys> Not sure I understand exactly.. sounds like the actions could grow and grow. If this is a test you can make it as simple as you want but if the actions are going to grow in an unlimited manner once you go to prod you might want to break them out into their own collection
[18:49:45] <souza> linsys: Ok, thanks, now i can think about and get a conclusion.
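A sketch of the two layouts linsys contrasts, with invented names:

    // option 1: embed the log entries in the user document (simple, but the array grows without bound)
    db.users.update({ _id: "joe" }, { $push: { log: { action: "login", at: new Date() } } });

    // option 2: a separate collection, one document per log entry, referencing the user
    db.logs.insert({ user_id: "joe", action: "login", at: new Date() });
    db.logs.ensureIndex({ user_id: 1, at: -1 });
    db.logs.find({ user_id: "joe" }).sort({ at: -1 });   // one user's actions, newest first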
[18:56:42] <spicyWith> is anyone here using cloudformation to deploy mongo on ubuntu and ec2?
[19:03:38] <linsys> spicyWith: I am actually using Fabric to do ec2 deployments, working on that right now
[19:04:46] <spicyWith> linsys: ah cool - used to use fabric as well. just discovered cloudformation which seems very powerful to describe infrastructures - having some trouble attaching EBS volumes to ubuntu though
[19:08:05] <nicholasdipiazza> why is mongodb so much better for documents with plenty of internal structure versus a small fixed size?
[19:09:28] <durre_> if I have two "entities" (Domain & Position). I want to find all positions for a certain domain. should I "link" to Domain with the id, or should position contain the domain name to let me do the query? what's "the mongo way"?
[20:03:54] <fjay> anyone seen anything like this...
[20:03:55] <fjay> mongos> db.profile.find().limit(900).length()
[20:03:56] <fjay> 804
[20:03:56] <fjay> mongos> db.profile.find().length();
[20:03:56] <fjay> 2613
[20:04:07] <fjay> shouldnt the first one return 900?
[20:05:06] <dstorrs> fjay: try doing this in your mongo shell => var x = db.profile.find()
[20:05:12] <dstorrs> typeof x
[20:05:33] <dstorrs> not sure, but I suspect you'll get back a string, or a string object
[20:05:54] <fjay> object
[20:06:35] <fjay> i dumped this db and restored it into our dev env
[20:06:41] <fjay> and it doesnt do this :)
[20:06:45] <fjay> it behaves as expected
[20:07:29] <dstorrs> huh. hold one, let me check something.
[20:08:38] <dstorrs> are your two DBs running the same mongo version?
[20:08:53] <fjay> 2.0.2 and 2.0.4 (.4 being the broken one)
[20:08:54] <dstorrs> because find() returns a cursor
[20:08:57] <fjay> i think its the 4
[20:09:03] <fjay> 2.0.2 being dev
[20:09:12] <dstorrs> what driver language ?
[20:09:21] <fjay> this is just me using the mongo shell
[20:09:51] <fjay> but we see the same issue in the java driver
[20:10:17] <dstorrs> well, I think the issue is this :
[20:10:23] <dstorrs> .find() returns a cursor
[20:10:37] <dstorrs> which you then ".length" on
[20:10:42] <dstorrs> so it stringifies it
[20:11:03] <dstorrs> it so happens that the string representation of the one on the "broken" box is 900 bytes
[20:11:38] <dstorrs> whereas the other one is 2613 due to data diffs / filesystem diffs / version diffs / whatever
[20:12:13] <dstorrs> basically, I *think* this is a category error
[20:12:20] <fjay> so.. how would i count the elements?
[20:12:24] <fjay> rather than using length?
[20:12:48] <fjay> to do paging ?
[20:13:27] <fjay> or is it not that.. as much as its just length() being broken
[20:13:35] <fjay> 'broken' for this use case
[20:14:09] <dstorrs> I very much doubt that length is 'broken', except perhaps in the sense that sharks are broken as paintbrushes :>
[20:14:31] <fjay> i should say.. wrong tool for the job
[20:15:11] <dstorrs> so, clarify for me what exactly you're trying to do? page through a collection <=900 elements at a time?
[20:15:23] <dstorrs> and what are you going to do with the data on the other end?
[20:16:22] <fjay> well.. i had someone pass this on to me as being 'broken' :)
[20:16:40] <fjay> but i can see how the find() might cause issues vs. just having an array
[20:18:28] <dstorrs> well, if you just want to page data, find() gives you a cursor
[20:18:57] <fjay> we are trying to page w/ offsets for pagination..
[20:19:04] <dstorrs> that's all the paging control you'll ever need. just store the '900' (or whatever) in an env var / config / code etc
[20:19:11] <fjay> yeah
[20:19:23] <fjay> writing a quick test script now
[20:19:32] <fjay> in ruby to prove/disprove its a mongo issue
[20:20:56] <dstorrs> speaking of configs / env vars, has anyone else seen this? http://www.12factor.net/
[20:21:01] <ayakushev> hey, i'm working with fjay on this
[20:21:17] <ayakushev> and we can't just use a cursor for paging
[20:21:31] <dstorrs> ayakushev: why not?
[20:22:11] <ayakushev> the requests are stateless, and come from elsewhere
[20:22:49] <dstorrs> how can you have pagination without state being stored somewhere?
[20:23:07] <ayakushev> use skip/limit
[20:23:18] <dstorrs> which stores the state in the URL
[20:23:25] <dstorrs> or the session
[20:23:27] <dstorrs> or ...
[20:23:36] <dstorrs> so why won't those things work here?
[20:25:03] <dstorrs> (the '...' should be read as 'or the database handle, which must be stored in DB and then referenced by state stored in the....)
[20:25:42] <ayakushev> yes, we can store some state
[20:25:53] <ayakushev> that's not even the point
[20:26:00] <ayakushev> what if we just want to return the first 900
[20:26:12] <ayakushev> mongos> db.profile.find().limit(900).length()
[20:26:12] <ayakushev> 804
[20:26:19] <ayakushev> ^ that sucks
[20:26:31] <dstorrs> what are you trying to return?
[20:26:46] <dstorrs> because that does not return the first 900 results, nor should it.
[20:26:58] <ayakushev> why not?
[20:27:50] <dstorrs> db.COLL.find(NNN).limit(MMM) means "create a cursor in collection COLL that is restricted to at most MMM rows and then will return 'undef'"
[20:28:04] <dstorrs> (or whatever your driver's equivalent of undef is)
[20:28:45] <dstorrs> I don't know why you keep trying to apply .length() to it. I can't even tell what you think it's going to do, which is why I'm having trouble helping.
[20:29:15] <dstorrs> step back and lay out for me exactly what you're trying to achieve. I'll help if I can.
[20:29:23] <ayakushev> ok, i'll get the results, from it
[20:29:25] <ayakushev> db.profile.find().limit(900).toArray().length
[20:29:36] <ayakushev> 804
[20:29:39] <ayakushev> same result
[20:30:25] <dstorrs> ayakushev: I'll say it one more time, then I give up. Please step back and lay out for me exactly what you're trying to achieve. I'll help if I can.
[20:31:16] <ayakushev> i'm trying to get the first 900 results from a collection
[20:31:53] <dstorrs> ok. db.profile.find().limit(900) returns a cursor which does that.
[20:32:19] <dstorrs> you can then repeatedly call "doc = cursor.next()"
[20:33:56] <dstorrs> does this fit your use case?
[20:34:17] <ayakushev> yeah, but it stops at 804
[20:35:49] <dstorrs> you and fjay have mentioned the number '804' twice now. The first time was when he said you were doing this: db.profile.find().limit(900).length()
[20:36:01] <fjay> yeah.. and if i do this.. in ruby...
[20:36:09] <fjay> count = 0
[20:36:09] <fjay> profile.find({},{:limit => 900}).each do |flarg|
[20:36:09] <fjay> count = count +1
[20:36:09] <fjay> end
[20:36:12] <fjay> puts count
[20:36:14] <fjay> it prints 804
[20:36:21] <dstorrs> you then came in and said you were doing this: db.profile.find().limit(900).toArray().length
[20:36:30] <dstorrs> first off, which one are you guys doing?
[20:36:37] <fjay> all of them :)
[20:36:39] <fjay> they are all broken ;)
[20:36:49] <dstorrs> ok.
[20:36:54] <fjay> something is making the cursor only return 804 results
[20:37:36] <dstorrs> just to eliminate the obvious -- in the mongo shell, do db.profiles.count() to verify record total
[20:37:51] <dstorrs> and ensure that you're on the right box when you do it.
[20:37:56] <fjay> mongos> db.profile.count()
[20:37:56] <fjay> 2613
[20:37:57] <fjay> mongos>
[20:38:14] <dstorrs> mongos...hm. Are you using a sharded collection?
[20:38:19] <fjay> yup
[20:38:27] <dstorrs> aha.
[20:38:45] <fjay> but... on another box we are using sharding.. and it doesnt show the issue ;) perhaps from sharding after the fact
[20:38:47] <dstorrs> Ok, I'm beyond my knowledge here, so I suggest throwing a ticket at JIRA
[20:39:10] <dstorrs> but I suspect that what's happened is that on one box there are actually >900 docs resident locally
[20:39:22] <fjay> its all living on the same shard ;)
[20:39:38] <dstorrs> and on the other there are not. That would be my guess, anyway....
[20:39:47] <fjay> nope.. both are single shard
[20:39:54] <fjay> err rather.. both are living on the single shard
[20:39:55] <dstorrs> Curses, another brilliant theory foiled by an ugly fact.
[20:40:56] <dstorrs> 804...804...
[20:41:01] <dstorrs> that's a weird number.
[20:41:18] <ayakushev> well
[20:41:21] <dstorrs> it's not a power of two, it's not a file permission.
[20:41:24] <ayakushev> mongos> db.profile.find().limit(850).length()
[20:41:24] <ayakushev> 850
[20:41:28] <ayakushev> mongos> db.profile.find().limit(855).length()
[20:41:29] <ayakushev> 849
[20:41:36] <ayakushev> mongos> db.profile.find().limit(860).length()
[20:41:36] <ayakushev> 844
[20:41:44] <ayakushev> think it has something to do with a particular record
[20:41:57] <dstorrs> yeah
[20:42:09] <dstorrs> what does one document from your collection look like?
[20:42:15] <ayakushev> but, as fjay mentioned, copying the whole db to a different place fixes the world
[20:42:38] <fjay> can't really share doc's from the collection w/o sanitization
[20:42:49] <dstorrs> ok, I'm stumped. :<
[20:43:19] <dstorrs> sorry guys, I got nothin'. All I can say is try to narrow it to that one record and then throw it in JIRA so it can be fixed in next release.
[20:43:29] <dstorrs> please do be sure to post the ticket, though.
[20:45:21] <ayakushev> ok, thanks for your help
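For anyone reproducing this, a few ways to count what a limited cursor actually returns; the collection name and numbers are the ones from this log, and against a healthy mongos all three should agree, which is why the mismatch looks like a bug:

    db.profile.count();                              // 2613: total documents, counted server-side
    db.profile.find().limit(900).itcount();          // iterates the cursor and counts what comes back
    db.profile.find().limit(900).toArray().length;   // same idea, but materializes everything in memory
    // here the last two return 804 instead of 900 through mongos, but behave as expected
    // when connected to the shard's mongod directly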
[21:31:36] <nicholasdipiazza> If I have doc = {"inner":[{"innerId":1, "name":"nick"}, {"innerId":2, "name":"fred"}, ..., {"innerId":100, "name":"jeff"}, ]} How can I update everyone's name to "Todd"
[21:33:10] <skot> you cannot in a single operation
[21:33:50] <skot> see the docs here: http://www.mongodb.org/display/DOCS/Updating
[21:34:54] <nicholasdipiazza> ok. so I would have to write some sort of loop?
[21:36:12] <nicholasdipiazza> i am having trouble knowing what operations i can perform on the document variables themselves
[21:36:54] <dstorrs> nicholasdipiazza: a document is just a JSON struct.
[21:37:21] <nicholasdipiazza> d = {'myId':'value'}
[21:37:41] <nicholasdipiazza> that's a json struct where d.myId = value
[21:38:32] <dstorrs> nicholasdipiazza: as to the updating...you could write a loop, you could map/reduce, you could do it in app code, you could replace all of those records with a single one -- it all depends on what you're trying to achieve
[21:38:52] <nicholasdipiazza> let's say i have d = {"myid":"thisismyid", "values":["1", "2", "3"]}
[21:39:10] <nicholasdipiazza> how can i update that document (without using db.save or update) form mongo console
[21:39:56] <dstorrs> erm...why are db.save and db.update verboten? If you're trying to change data from the console, that's kinda how you do it
[21:40:09] <nicholasdipiazza> oh ok
[21:40:35] <dstorrs> also, step back farther. what does this collection represent? why are you using embedded docs? why the update?
[21:40:37] <nicholasdipiazza> i'm used to treating a JSON object in Javascript... where if I loaded an object with an array of strings, i could loop through and update those strings with a for loop
[21:41:12] <nicholasdipiazza> This is all based on an interview question
[21:41:32] <dstorrs> you can do the exact same thing in mongo. The diff is that if you want the data to persist, you need to save it back to disk
[21:41:46] <dstorrs> s/data/changed data/
[21:41:58] <nicholasdipiazza> can you give me an example code snippet on pastebin?
[21:42:07] <nicholasdipiazza> i'm stumped how to do that sort of stuff after getting an object with find
[21:42:28] <nicholasdipiazza> I've found my object... now I want to update a bunch of the array values stored in that object, then update it
[21:42:33] <dstorrs> you're applying for a job and this was one of the questions?
[21:42:42] <nicholasdipiazza> i'll give you the exact question
[21:42:47] <tystr> heh
[21:42:58] <nicholasdipiazza> yeah and it left me scratching my head
[21:43:35] <dstorrs> nicholasdipiazza: I'll save you some time -- if someone else wants to help you with this, more power to both of you. But I'm not comfortable doing someone else's homework, so I'm going to bow out.
[21:43:46] <dstorrs> I hope you find the answer and get the job, though.
[21:44:46] <nicholasdipiazza> oh it's not like that
[21:44:51] <nicholasdipiazza> the interview is over i'm just wondering
[21:45:08] <dstorrs> suddenly new information appears. :>
[21:45:11] <nicholasdipiazza> so feel free to look at it if you want. http://pastebin.com/5nnqagPB
[21:46:04] <dstorrs> and you have to do this in the shell, or can you use app code?
[21:46:07] <nicholasdipiazza> shell
[21:46:34] <nicholasdipiazza> had to write it on a whiteboard i was like... nope no idea lol
[21:47:35] <dstorrs> nicholasdipiazza: good for you for finding the answer afterwards. I'd start looking here: http://docs.mongodb.org/manual/reference/javascript/
[21:48:15] <dstorrs> you may also want to look at this: http://docs.mongodb.org/manual/reference/javascript/#cursor-methods
[21:48:38] <nicholasdipiazza> cool thanks. anyone else want a head scratcher give that guy a look
[21:48:55] <nicholasdipiazza> http://pastebin.com/5nnqagPB
[21:49:21] <dstorrs> at each step ask yourself "if I were doing this in a JS script, how would I do it?" then back that up until you've turned the mongo environment into essentially the same script
[21:49:22] <nicholasdipiazza> why are these api docs not easier to get online? i was looking for something like this with no luck
[21:49:51] <nicholasdipiazza> oh it's because i wasn't looking in the javascript reference
[21:49:52] <nicholasdipiazza> i see
[21:53:23] <dstorrs> nicholasdipiazza: as to finding the docs, try this: http://lmgtfy.com/?q=mongo+shell ;>
[21:53:33] <dstorrs> that's all I did to find the links I gave you
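A sketch of the loop approach dstorrs hints at, using the document shape from the question (the collection name is invented); in these versions there is no single update operator that rewrites every element of an embedded array:

    db.people.find({ inner: { $exists: true } }).forEach(function (doc) {
        doc.inner.forEach(function (item) {
            item.name = "Todd";      // rewrite each embedded element in plain JavaScript
        });
        db.people.save(doc);         // write the whole modified document back
    });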
[21:55:37] <ramsey> Derick: ping
[21:55:41] <nicholasdipiazza> cool!
[21:55:41] <nicholasdipiazza> lol
[21:56:28] <dstorrs> nicholasdipiazza: if you're at all interested in Mongo, I STRONGLY suggest setting aside 3-4 hours and reading all of docs under http://docs.mongodb.org/manual/
[21:57:03] <modcure> lets say i fire up mongodb. I run a query(get documents from a collection). mongodb would need to fetch this from virtual memory(on disk but faster than non virtual memory). this would cause a page fault since its the first time the data is being accessed?
[21:57:03] <nicholasdipiazza> I did. i feel like they get you part of the way with simple scenarios
[21:57:11] <dstorrs> ramsey: you know, I've never seen any of Derick kchodoro_ or tychoish actually active on this channel. have you?
[21:57:25] <ramsey> dstorrs: I have
[21:58:07] <dstorrs> or hey, I just realized. kchodorow must be Kristina Chodorow -- shiny! we are in the presence of greatness.
[21:59:00] <eph3meral> so I'm basically absolutely new to mongo querying, I've worked a bit with Mongoid and I did the basics of this like, last year... plenty familiar with SQL and JavaScript themselves... so anyway...
[21:59:07] <ramsey> dstorrs: indeed we are :-)
[21:59:10] <eph3meral> i've come across this issue here while reading http://www.mongodb.org/display/DOCS/Updating#Updating-%24set
[21:59:11] <dstorrs> modcure: they typically suggest that you keep your working set small enough to fit in RAM
[21:59:16] <eph3meral> set doesn't seem to want to add a field
[21:59:19] <eph3meral> is that true?
[21:59:29] <eph3meral> can I not add new fields to a document using update and set?
[21:59:35] <eph3meral> the document already exists
[21:59:58] <eph3meral> this is my query
[22:00:01] <eph3meral> db.dashboards.update( {_id: "4fce7c96b29a7b141c000001"}, { $set : { creator_id : "4fce7b10b29a7b12c7000001" } } )
[22:00:13] <dstorrs> eph3meral: step back for a second and tell us what you're trying to achieve at a high level
[22:00:25] <dstorrs> (I really need to make a macro for that sentence)
[22:00:30] <eph3meral> dstorrs, uhh... I'm trying to add a field to this document
[22:00:35] <modcure> dstorrs, when mongodb loads. it maps the data files to virtual memory(on disk but faster than regular block on disk). then i run a query. mongodb would need to fetch the data from virtual memory into memory? would this be considered a page fault?
[22:00:37] <eph3meral> the document already exists
[22:00:55] <eph3meral> dstorrs, i know what you mean, but there is not much higher of a level than this
[22:01:04] <eph3meral> i want to add a field to an already existing document
[22:01:21] <kchodoro_> dstorrs: :-D
[22:01:24] <dstorrs> eph3meral: yes. why? what is the collection? why are you updating it?
[22:01:52] <eph3meral> dstorrs, it's unicorns, and it's for fun, because I'm curious and I want to know how to do this
[22:02:24] <eph3meral> i'm thirsty for knowledge :)
[22:03:32] <dstorrs> modcure: I believe it would, yes.
[22:03:42] <dstorrs> eph3meral: heh. fair enough.
[22:04:07] <kchodoro_> eph3meral: that looks correct, what's happening?
[22:04:34] <dstorrs> kchodoro_: I write in Perl and have been using your MongoDB driver. It is great -- thank you.
[22:04:50] <kchodoro_> you're welcome! glad it's working for you
[22:04:53] <modcure> dstorrs, since mongodb maps the files to virtual memory on disk. this would mean I would need double space for my data? one for disk and one for virtual disk?
[22:05:38] <eph3meral> kchodoro_, essentially "nothing" i get no response from the mongo shell at all, i hit enter, it goes back to the prompt, and when I do a subsequent db.dashboards.find() the data is not there
[22:05:46] <eph3meral> the data is not updated, nor is it added
[22:05:59] <dstorrs> modcure: dude, you're in a channel with kchodoro_ answering questions! don't ask me, ask her. :>
[22:06:28] <modcure> new to mongodb, I don't know who is who :)
[22:06:41] <dstorrs> (did you notice how nicely I avoided saying "I don't know?" :>)
[22:06:49] <kchodoro_> dstorrs: i appreciate the enthusiasm, but feel free to help out :)
[22:06:51] <modcure> :)
[22:07:01] <modcure> kchodoro_, answer my question please :)
[22:07:44] <kchodoro_> eph3meral: can you pastebin what you get doing db.dashboards.findOne({_id: "4fce7c96b29a7b141c000001"})?
[22:07:53] <eph3meral> yep
[22:08:00] <kchodoro_> modcure: er... what's your question? i just got here
[22:08:11] <eph3meral> ahh, yeah it's null
[22:08:18] <eph3meral> hmm, weird
[22:08:32] <kchodoro_> eph3meral: you sure you don't mean ObjectId("4f...")
[22:08:33] <kchodoro_> ?
[22:08:38] <eph3meral> I am not sure :)
[22:08:43] <modcure> kchodoro_, when mongodb loads. it maps the data files to virtual memory(on disk but faster than regular block on disk). then i run a query. mongodb would need to fetch the data from virtual memory into memory? would this be considered a page fault?
[22:08:45] <eph3meral> let me try that
[22:08:51] <kchodoro_> eph3meral: ObjectId != string
[22:09:36] <kchodoro_> modcure: i think so
[22:10:14] <eph3meral> yep, i was looking in to just that issue, didn't know what the syntax was tho, so thanks
[22:10:26] <kchodoro_> np
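The corrected update, for reference: the original never matched because the stored _id is an ObjectId, not a string (ids copied from the log):

    db.dashboards.update(
        { _id: ObjectId("4fce7c96b29a7b141c000001") },        // match on the ObjectId, not the hex string
        { $set: { creator_id: "4fce7b10b29a7b12c7000001" } }
    );
    db.dashboards.findOne({ _id: ObjectId("4fce7c96b29a7b141c000001") });  // should now show creator_id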
[22:11:05] <modcure> kchodoro_, since mongodb maps the files to virtual memory on disk. this would mean I would need double space for my data? one for disk and one for virtual disk?
[22:11:20] <dstorrs> kchodoro_: are all ObjectIDs with the same hash considered == ? And are they considered === ?
[22:12:19] <tychoish> ObjectIds aren't a hash.
[22:13:02] <dstorrs> ObjectID("4fce7c96b29a7b141c000001") <=== what is the "4f..." properly called?
[22:14:19] <kchodoro_> dstorrs: i guess the value of the object id
[22:14:26] <kchodoro_> generally just the object id
[22:15:28] <kchodoro_> dstorrs: looks like they're not even ==
[22:15:39] <kchodoro_> which seems silly, since the db matches them
[22:15:51] <kchodoro_> that would probably be considered a bug in the shell
[22:16:42] <dstorrs> post a JIRA on that? you want to or should I?
[22:20:47] <modcure> kchodoro_, since mongodb maps the files to virtual memory on disk. this would mean I would need double space for my data? one for disk and one for virtual disk?
[22:21:16] <kchodoro_> dstorrs: go ahead
[22:22:08] <kchodoro_> modcure: nope, just once... although journaling can mess with that, it can map everything a second time, i think
[22:22:57] <modcure> kchodoro_, im lost. the data is on disk... mongodb loads the data files to virtual memory(this is on disk too). wouldnt this be double the space?
[22:26:35] <kchodoro_> modcure: virtual memory isn't disk, it's an abstraction the OS uses
[22:26:44] <kchodoro_> modcure: maybe you're thinking swap?
[22:27:18] <modcure> kchodoro_, i need to read up on virtual memory
[22:42:15] <timebomb> heya, say i have a company with many locations, should these be embedded via embedded documents? and will i then still be able to make use of the spatial indexing?
[23:05:21] <dstorrs> kchodoro_: (back. had to go interview someone) Bug report is here: https://jira.mongodb.org/browse/SERVER-6010
[23:05:52] <kchodoro_> dstorrs: cool, thanks!
[23:06:09] <dstorrs> np. Note that I used your nick in it, if you care.
[23:06:30] <dstorrs> some people get bugged about quoting server logs, so figured I'd mention it.
[23:13:09] <fjay> dstorrs: looks like its a bug in mongos after doing some digging
[23:13:42] <dstorrs> fjay: where 'mongos' means 'the shell' or 'the shard manager' ?
[23:13:48] <fjay> the shard manager
[23:13:58] <fjay> mongo router
[23:14:06] <fjay> talking to the shard directly.. it works as advertised
[23:14:23] <dstorrs> wow, that's bad.
[23:16:23] <dstorrs> kchodorow: fjay is doing something like this: db.profiles.find().limit(900) and getting only ~800 records back from a >2000 record sharded collection. Is there any reason you can think of for this, or is this a "file a bug" thing?
[23:18:04] <dstorrs> fjay: you might start queuing up that bug report on jira (see channel title).
[23:18:35] <dstorrs> tychoish: do you know? ==^
[23:19:00] <dstorrs> fjay: the channel gods might be able to explain it but yeah, sounds like a bug
[23:22:25] <fjay> thats the plan as its a bug we see in prod but
[23:22:33] <fjay> for now i am going to cook dinner :)
[23:23:15] <dstorrs> enjoy :>