[02:02:02] <donCams> hi. how do i create a query to find several documents with "where", sorting, and pagination using spring data mongorepository? i can't find any example
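(No one answered in the channel, but for reference: Spring Data MongoRepository handles "where"-style conditions via derived query methods or @Query, and sorting plus pagination via a Pageable. A minimal sketch, assuming a made-up Article document and ArticleRepository; the exact PageRequest/Sort factory methods depend on the Spring Data version.)

```java
import org.springframework.data.annotation.Id;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.repository.MongoRepository;
import org.springframework.data.mongodb.repository.Query;

@Document(collection = "articles")
class Article {
    @Id
    String id;
    String author;
    int views;
}

interface ArticleRepository extends MongoRepository<Article, String> {

    // "where author = ?0", with sorting and pagination supplied by the Pageable
    Page<Article> findByAuthor(String author, Pageable pageable);

    // the same idea with an explicit MongoDB query document
    @Query("{ 'author': ?0, 'views': { '$gt': ?1 } }")
    Page<Article> findPopularByAuthor(String author, int minViews, Pageable pageable);
}

class ArticleQueries {
    Page<Article> firstPage(ArticleRepository repo) {
        // page 0, 20 results per page, sorted by views descending
        Pageable pageable = PageRequest.of(0, 20, Sort.by(Sort.Direction.DESC, "views"));
        return repo.findByAuthor("alice", pageable);
    }
}
```

(Each returned Page also carries getTotalElements()/getTotalPages(), so the pagination metadata comes for free.)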
[11:40:26] <unsleep> i want to do some statistics, but i think putting every visit inside a second nesting level would be a very bad thing. isn't it?
[11:41:47] <Derick> yes, as that makes the size of each document grow and grow and grow - and that means MongoDB will need to move it around on disk a lot
[11:41:58] <Derick> it's better to store each visit in its own document in a different collection
[11:42:24] <kali> and you're at risk of hitting the 16MB limit at some point
[11:43:06] <unsleep> so i was a bit right about it hihihihi
[11:44:56] <unsleep> i don't want to have to do a map reduce to process the info and show the stats, so i must think it through very carefully
[11:45:26] <Derick> unsleep: instead of Map Reduce, you should most likely use the Aggregation Framework anyway
[11:46:00] <kali> unsleep: that will not completely solve the "growing document" issue
[11:47:20] <unsleep> sure, having a million visits an hour would be a big problem ¬_¬
[11:48:00] <Derick> unsleep: you can either do pre-aggregation (ie, just store the total, and not each visit) or really just have one document in a collection per visit and use A/F
[11:49:02] <unsleep> i was thinking about that.. "preprocess" the info with php and take the stress off the db
[11:49:45] <unsleep> i think that using the db as an "on-the-fly calculator" is a big mistake
[11:51:19] <unsleep> i could do two collections, but that is very similar to doing one and then a map reduce
[11:57:34] <unsleep> taking into account the 16MB limit, i don't see any other way to do it than storing everything and then doing an aggregation or map reduce
[12:02:18] <unsleep> or simply forget about storing who did what when.... like you said
[12:17:57] <unsleep> do you recommend using $inc?
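(For reference, the pre-aggregation Derick describes is usually an upsert with $inc. A sketch with the MongoDB Java driver rather than PHP; the "analytics"/"daily_stats" names and the per-page-per-day key are just illustrative assumptions.)

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.UpdateOptions;
import com.mongodb.client.model.Updates;
import org.bson.Document;

public class VisitCounter {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> stats = client.getDatabase("analytics")
                    .getCollection("daily_stats");

            // One counter document per page per day. $inc only bumps a number,
            // so the document stays the same size instead of growing per visit.
            stats.updateOne(
                    Filters.and(Filters.eq("page", "/home"), Filters.eq("day", "2013-02-20")),
                    Updates.inc("visits", 1),
                    new UpdateOptions().upsert(true));
        }
    }
}
```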
[13:03:59] <dreamchaser> hi. i want to export all documents created between saturday and sunday to a csv file. when i use mongoexport, the query part always gives me errors.
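(This went unanswered, but the usual stumbling block is that --query must be strict JSON, so dates need the extended-JSON $date form. A hedged sketch; the db, collection and field names are made up, and older mongoexport builds expect milliseconds since epoch inside $date and use --csv instead of --type=csv.)

```sh
mongoexport --db mydb --collection events \
  --type=csv --fields _id,user,createdAt \
  --query '{ "createdAt": { "$gte": { "$date": "2013-02-16T00:00:00Z" },
                            "$lt":  { "$date": "2013-02-18T00:00:00Z" } } }' \
  --out weekend-events.csv
```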
[14:16:13] <hotsnow> i defined the data structure like a perl hash, so i made a mistake
[14:18:31] <hotsnow> It doesn't matter, thanks everybody
[14:18:47] <bobinator60> if I have an array inside my document, can I have two $elemMatch clauses to have them match two different elements of the array? eg values: $elemMatch{type: distance, value: 50}, values: $elemMatch{type: color, value: blue}
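(Nobody answered, but yes: two identical keys would collapse in JSON, so the clauses go under $and (or $all with $elemMatch), and each one may then be satisfied by a different array element. A sketch with the Java driver's Filters helpers, using the field names from the question and an assumed "items" collection.)

```java
import static com.mongodb.client.model.Filters.and;
import static com.mongodb.client.model.Filters.elemMatch;
import static com.mongodb.client.model.Filters.eq;

import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import org.bson.conversions.Bson;

public class TwoElemMatch {
    public static void main(String[] args) {
        MongoCollection<Document> coll = MongoClients.create("mongodb://localhost:27017")
                .getDatabase("test").getCollection("items");

        // Each $elemMatch must be satisfied by at least one element of "values";
        // under $and they may be satisfied by *different* elements.
        Bson filter = and(
                elemMatch("values", and(eq("type", "distance"), eq("value", 50))),
                elemMatch("values", and(eq("type", "color"), eq("value", "blue"))));

        coll.find(filter).forEach(doc -> System.out.println(doc.toJson()));
    }
}
```

(In the shell the equivalent is { values: { $all: [ { $elemMatch: {...} }, { $elemMatch: {...} } ] } }.)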
[19:37:32] <crudson> re. n06's comments: as long as you are not passing input directly to $where, db.eval() or map reduce function strings, user-supplied javascript can't just be injected into a query like that.
[19:38:11] <n06> crudson, can or can't? I can't tell if you are agreeing with me or not
[19:38:28] <n06> do only $where and db.eval() execute code?
[19:38:52] <crudson> I am generally disagreeing :) http://docs.mongodb.org/manual/faq/developers/#how-does-mongodb-address-sql-or-query-injection
[19:39:42] <n06> crudson, interesting, thanks for the heads up
[19:39:56] <n06> i would still caution not sanitizing user input. Never trust your users :)
[19:41:57] <crudson> n06: no probs. There are situations where you have to be careful, but it's a different deal than with SQL strings. A string containing javascript doesn't magically become query json unless using eval/where, as detailed above.
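(As a concrete contrast of the two cases crudson describes; a hypothetical "users" collection, with the Java driver used purely for illustration.)

```java
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;

public class InjectionContrast {
    public static void main(String[] args) {
        MongoCollection<Document> users = MongoClients.create("mongodb://localhost:27017")
                .getDatabase("app").getCollection("users");

        String userInput = "alice'; while(true){}; '"; // attacker-controlled string

        // Safe: the input is just a BSON string value to compare against.
        users.find(Filters.eq("username", userInput)).first();

        // Risky: the input is concatenated into JavaScript that $where will execute.
        users.find(new Document("$where",
                "this.username == '" + userInput + "'")).first();
    }
}
```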
[19:42:56] <kali> using $where or eval() (or even map reduce) is a bad idea anyway
[19:43:35] <kali> (and then the effort needed to prevent injection becomes minimal)
[19:44:09] <n06> kali, mapreduce is a super useful function, but i have only used it in a user-isolated case.. it's never anywhere near an endpoint
[19:45:15] <n06> crudson, absolutely, i understand now. My point from before was simply that keeping good security practices is always a good idea. Never trust anything or anyone but yourself haah
[19:48:15] <kali> n06: map reduce is not so useful now that the aggregation framework has been introduced :)
[19:53:05] <crudson> kali: eval is not as bad as before with 2.4, with less strict locks and multi-threading. Map reduce is very useful. Try doing an .aggregate on a big collection, or anything more complex than the exposed functions provide.
[21:54:52] <Bartzy> I have a comments collection and a photos collection. Each comment has a "photo_id". I want to do a map reduce that gets the number of comments for each photo, but as something like a gaussian distribution
[21:55:21] <Bartzy> so essentially I want to see what is the count of comments for -most- of the photos.
[22:06:27] <Bartzy> Have no idea how to normalize the data though. I now get that I need data that looks like: 1 comment -> 20 photos, 2 comments -> 35 photos, 3 comments -> 400 photos
[22:18:24] <Bartzy> it just seems so much faster, maybe my map reduce was bad. Can aggregate write results to a collection?
[22:18:43] <paulkon> for each photo in the photos collection return the number of comment docs and send that in an array to the client and render that data with http://www.jstat.org/
[22:18:49] <paulkon> if I understand you correctly
[22:19:01] <paulkon> unless you need that data on the server beforehand
[22:19:35] <crudson> Bartzy: to filter the input documents use $match in aggregation, or 'query' option in mapreduce
[22:20:05] <Bartzy> crudson: I meant write the result of the aggregation to a collection instead of seeing it on the shell
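(For the record, the histogram Bartzy describes is two $group stages, and from MongoDB 2.6 onwards a $out stage can write the result to a collection instead of returning it to the shell. A sketch with the Java driver; the collection and field names come from the conversation, the output collection name is made up.)

```java
import java.util.Arrays;

import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Accumulators;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Sorts;
import org.bson.Document;

public class CommentHistogram {
    public static void main(String[] args) {
        MongoCollection<Document> comments = MongoClients.create("mongodb://localhost:27017")
                .getDatabase("app").getCollection("comments");

        comments.aggregate(Arrays.asList(
                // 1) comments per photo: {_id: photo_id, count: n}
                Aggregates.group("$photo_id", Accumulators.sum("count", 1)),
                // 2) photos per comment count: {_id: n_comments, photos: m}
                Aggregates.group("$count", Accumulators.sum("photos", 1)),
                Aggregates.sort(Sorts.ascending("_id")),
                // 3) persist the histogram instead of printing it
                Aggregates.out("comment_histogram")
        )).toCollection();
    }
}
```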
[22:32:18] <Bartzy> paulkon: map reduce is code, you can do whatever you need
[23:00:21] <orngchkn> How can I disable the warning "warning: ClientCursor::yield can't unlock b/c of recursive lock ns"? It's flooding my logs (writing about a meg per second)…
[23:01:31] <orngchkn> As I understand it, findAndModify is the "culprit" and they say that it's an error that can be ignored. But how can I get mongo to stop printing it altogether so I don't have to babysit the log filesystem?
[23:01:57] <orngchkn> I keep getting degraded mongo perf when the log filesystem hits 100% usage (even though I have logrotate running hourly…)
[23:06:36] <paulkon> I guess that's expected behavior
[23:36:25] <orngchkn> paulkon: It says in that thread that a good index will usually help the issue. What's the best way to figure out if the indexes that I have in place are insufficient?
[23:36:51] <orngchkn> (I'm using Delayed::Job with Mongoid which ought to have pretty good indices)
[23:38:57] <crudson> orngchkn: you can run with --notablescan if you want it to error on non-indexed queries
[23:39:35] <crudson> that's the strict way to ensure indexes are there, but that is not the same as them being "appropriate" or "optimal"
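(A gentler alternative to --notablescan is to run explain on the hot queries and check whether the winning plan is an index scan (IXSCAN) or a collection scan (COLLSCAN). A sketch using the explain command via the Java driver; the "jobs" collection and the filter are placeholders for whatever Delayed::Job actually runs, and the command form shown needs MongoDB 3.0+ — older versions use cursor.explain() in the shell.)

```java
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

public class ExplainCheck {
    public static void main(String[] args) {
        MongoDatabase db = MongoClients.create("mongodb://localhost:27017").getDatabase("app");

        // Wrap the query you care about in an explain command and inspect the plan.
        Document explain = new Document("explain",
                new Document("find", "jobs")
                        .append("filter", new Document("locked_at", null)))
                .append("verbosity", "queryPlanner");

        Document plan = db.runCommand(explain);
        // Look at queryPlanner.winningPlan: IXSCAN means an index is used,
        // COLLSCAN means the query walks the whole collection.
        System.out.println(plan.toJson());
    }
}
```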
[23:50:34] <kurtis> Hey guys; I'm working with millions of Documents. During a user's "session", they typically only deal with a subset of these tuples. The largest I've seen so far is 13 million+. I'm running multiple queries on these sets of Documents during a user's session. Would it be smart to cache the ObjectIDs of all of these Documents using Redis (or something similar) for quicker queries? Or, is there a better mechanism for caching the set of Documents for all of the