[00:01:20] <GitGud> hey. I wanted to know the standard way to do a particular thing. I will have a database of posts made by a bunch of users that needs to stay sorted chronologically at all times, and the front page of my webpage will run a query that lists the 4 most recent user posts. My question is: what is the most efficient way to sort them, index them, and preserve the sorting in the index,
[00:01:20] <GitGud> and then make sure the query for the first 4 posts uses the date-sorted index and returns the 4 most recent post objects?
[03:09:21] <GitGud> is it more efficient to query on a unique field than on a non-unique one?
[03:09:53] <cheeser> either one should be an index lookup
[03:11:25] <GitGud> is it possible to index a list in chronological order, and preserve that so that the next query returns results by most recent?
[03:12:01] <cheeser> index on date ascending, sure.
[03:12:16] <cheeser> though if you use ObjectId for your _id you get that for free
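(A minimal sketch of cheeser's point, in the mongo shell, assuming a hypothetical posts collection: the default ObjectId _id begins with a 4-byte creation timestamp, so the built-in _id index already orders documents by insertion time.)

    // newest 4 documents, served by the default _id index
    db.posts.find().sort({ _id: -1 }).limit(4)
    // the creation time embedded in any ObjectId
    ObjectId().getTimestamp()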
[03:13:48] <GitGud> I'm not sure what you mean. I will have a huge list of posts by users, and they will have to be presented as the 4 most recent posts, etc. So they would have to be indexed (for efficiency) in chronological order, i.e. by when the post was made
[03:14:20] <GitGud> which is why I was wondering if indexing by time/date can be preserved, so that when I do a findOne or a find on the db it defaults to returning the most recent
[03:17:16] <cheeser> if you have a date field, index it.
[03:17:38] <cheeser> sort by that field descending, limit 4. done.
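(A sketch of the pattern cheeser describes, with hypothetical collection/field names; a descending index on the date field lets the query walk the index in order and stop after 4 documents.)

    // one-time setup: index the date field
    db.posts.createIndex({ createdAt: -1 })
    // front-page query: most recent 4 posts
    db.posts.find().sort({ createdAt: -1 }).limit(4)
    // to confirm the index is used, check for IXSCAN (not COLLSCAN) here:
    db.posts.find().sort({ createdAt: -1 }).limit(4).explain()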
[03:17:56] <GitGud> ah okay, that's what I thought. just wasn't sure if doing that would be utilizing the index or not
[03:18:18] <GitGud> would adding new data to a collection that is already indexed add that new data to the index?
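(To this follow-up: yes. MongoDB maintains indexes automatically; every insert, update, or delete on the collection also updates each of its indexes.)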
[03:55:54] <beemry> cuz I'm getting an error: 'NameError: uninitialized constant Foo::ObjectId' when I try to use it
[03:57:25] <beemry> apologies, the error is `uninitialized constant Foo::BSON`
[03:59:16] <beemry> the class includes Mongoid::Document, declares a field with type: BSON::ObjectId, throws a NameError when I try to instantiate an object.
[04:39:00] <dddh> mongoose is teh js ORM for l33t h4x0rz?
[08:52:40] <xhip> hi guys, can anyone explain how I can make a paged filter? (i mean selecting like 0-25, then 25-50, ...)
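(A sketch of two common paging approaches, assuming a hypothetical items collection: skip/limit is the simplest, but a range query on an indexed field scales better for deep pages, since skip still has to walk past the skipped entries.)

    // page 2 (items 25-49) via skip/limit
    db.items.find().sort({ _id: 1 }).skip(25).limit(25)
    // range-based alternative: resume from the last _id of the previous page
    db.items.find({ _id: { $gt: lastSeenId } }).sort({ _id: 1 }).limit(25)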
[09:06:57] <thirdy> my query looks like this: db.collectionA.find().limit(1000).forEach( function( a ) { var data = db.collectionB.findOne( { _id : a.fkToB } ).data } ) -- is there a faster way?
[09:12:33] <thirdy> another question: when you run "mongo" on your local machine connecting to a remote mongo server, is it restricted by the bandwidth between my machine and the remote server?
[09:41:10] <thirdy> ok, the answer to my 2nd question is a big YES, which is also the answer to my 1st question
[09:49:18] <dddh> thirdy: you want to run it on the server side?
[09:55:34] <thirdy> dddh: yep, I did. I ssh into our mongo ec2 server, and my mongo script is now so much faster
[09:56:08] <thirdy> apparently it works like a sql client: 1 query is 1 round trip to the server
[10:01:44] <dddh> you can save a javascript function on the server
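(A sketch of the stored-function mechanism dddh mentions, with hypothetical names. Note that db.loadServerScripts() pulls the stored functions into the current shell session, so they still execute in the shell; the big win thirdy saw came from running that shell on, or next to, the server so each findOne is a local round trip.)

    // save a function in the special system.js collection
    db.system.js.save({
        _id: "joinData",
        value: function(n) {
            db.collectionA.find().limit(n).forEach(function(a) {
                var data = db.collectionB.findOne({ _id: a.fkToB }).data;
                // ... process data ...
            });
        }
    });
    // later, in any shell session:
    db.loadServerScripts();
    joinData(1000);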
[12:30:09] <cheeser> fresh off the assembly line, too.
[13:08:41] <Kosch> hey guys. With mongodb 3.0.7 on linux I get this crash from time to time: http://nopaste.linux-dev.org/?845144 Do you have a clue what causes this?
[13:19:25] <deathanchor> Kosch: you running a map and reduce when that happens?
[13:22:27] <Kosch> deathanchor: Can't say; according to the log, the last operations were some "CMD: drop 0.tmp.mr.activities_11" just before "Invalid access at address: 0"
[13:22:58] <Kosch> (I'm not using mongodb, I'm just running it as ops)
[13:24:19] <Kosch> the other place where the crash happened was just after "#011 building index using bulk method" ... build index done. scanned 2 total records. 0 secs ... Invalid access at address: 0...
[13:50:34] <procton> Hi, I have a question regarding tailable cursors (MongoDB 3.0.7). I have set $maxTimeMS globally in the client. Does that affect the tailable cursor too?
[13:52:20] <deathanchor> Kosch: I suggest taking some of the free online mongo dba courses, or hand off the problem to someone who does the db management.
[13:53:12] <Kosch> deathanchor: the guy who has done a dba course moved this one to me :)
[13:53:35] <Kosch> seems the dba course is not very effective for these errors :-D
[13:54:43] <deathanchor> from my guess at the errors, you are doing some mongo operations which exceed the memory limits of the system, causing the crash.
[13:56:55] <Kosch> that was my first thought too, as I read something about memory in this stacktrace. I checked the monitoring and couldn't see any specific increase. I'd expect in these cases some kind of clear out-of-memory exception.
[13:57:43] <Kosch> I used 2.x before, but the lack of ssl config options and crashes during secondary replication motivated us to switch to 3.0 :-/
[13:59:47] <deathanchor> Kosch, was this happening on your secondary or primary?
[14:00:36] <deathanchor> so it must be some wacky read operation. again, if you know what op caused it, you could log a jira about how to replicate the issue
[14:01:47] <Kosch> afaik this error was triggered by someone else...
[14:02:53] <Kosch> however, I'll assign this problem back to our devs; they have to gather more details about the problem mentioned above.
[14:05:59] <procton> cheeser: The thing is, I get an "operation exceeded time limit" exception on the tailable cursor in the production and test environments. I can not reproduce that in a local single-instance installation, though.
[14:11:59] <Kosch> deathanchor: thanks anyway so far :)
[14:47:15] <cheeser> procton: so if i'm understanding everything correctly, 3.2+ will honor maxTimeMS on tailable cursors but prior versions do not.
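(A shell-level sketch of the combination under discussion, assuming a hypothetical capped collection named events; per cheeser, the limit is only honored on tailable cursors from 3.2 on.)

    // tailable cursors require a capped collection
    var cur = db.events.find()
                 .addOption(DBQuery.Option.tailable)
                 .addOption(DBQuery.Option.awaitData)
                 .maxTimeMS(500);  // honored for tailable cursors in 3.2+
    while (cur.hasNext()) { printjson(cur.next()); }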
[15:16:01] <procton> cheeser: Thanks. I will have to double check the mongodb version in prod/test.
[18:24:13] <jilocasin0> PI paywalls: they don't provide links to any substantive information (such as white papers) without you entering your personal information. Pretty much every link on this page.
[18:25:16] <jilocasin0> and there's nothing like this: https://www.jetbrains.com/idea/features/editions_comparison_matrix.html
[18:26:39] <jilocasin0> during a recent mongodb webinar I asked what the different versions were and what was in each. I was directed to pages with PI paywalls.
[18:27:13] <jilocasin0> when I asked for anything specific, I was told that anything _not_ listed on one of the commercial products was in the community version.
[18:29:09] <evil_andy> I'm trying to figure out why an aggregation is going slowly. I have 14mil records, and each one has a random number (not necessarily unique). I have an index over that field, and I'm trying to aggregate subsets of the collection by using $match to apply bounds to rand.
[18:29:13] <jilocasin0> cheeser: I just finished attending that.
[18:29:41] <cheeser> well, it lists a bunch of stuff on the post itself.
[18:29:50] <jilocasin0> No links to actual information, just lots of links to various PI paywalls.
[18:29:55] <evil_andy> The problem is, even with bounds of 0.01 to 0.02, which shouldn't match more than 100K records or so, it still takes 20+ minutes to run the aggregation
[18:32:16] <evil_andy> adding explain to the aggregation pipeline seems to show it's using the index, but I'm not sure if I'm doing something wrong. http://pastebin.com/ExACr8HQ is the aggregation I'm trying to run
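(Without the pastebin, a sketch of the expected shape, with hypothetical collection/field names: $match must be the first stage for the rand index to be usable, and explain should then show an IXSCAN. Note that even with the index, fetching ~100K full documents scattered across 14M can still be dominated by random I/O.)

    db.records.aggregate([
        // first stage, so the optimizer can use the index on rand
        { $match: { rand: { $gte: 0.01, $lt: 0.02 } } },
        { $group: { _id: null, n: { $sum: 1 } } }
    ], { explain: true })  // look for IXSCAN rather than COLLSCAN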
[18:32:22] <jilocasin0> release information -> reenter
[18:34:45] <cheeser> 10gen was renamed 2 years ago
[18:34:53] <StephenLynx> but they are the same people?
[18:35:15] <cheeser> for some value of the "same." it's a company. people come and go.
[18:36:13] <jilocasin0> postgresql, sqlite, mysql, Microsoft SQL Server, Hadoop, everyone else: here's the basics, here's some more, here's the advanced info. Want us to talk with you? Contact us here.
[18:36:27] <jilocasin0> mongodb PI paywalls everywhere..... :(
[18:36:31] <Derick> I just tested, I don't have to reenter any information.
[18:37:23] <Derick> But seriously, these are the whitepapers, focussed on executives. The docs have all the info too. For example: https://docs.mongodb.org/manual/release-notes/3.2/
[18:37:41] <jilocasin0> Derick: the release notes assume that you are already using mongodb
[18:38:25] <jilocasin0> I can't even find out what versions of mongodb even exist.
[18:38:28] <Derick> jilocasin0: what info are you after?
[18:39:11] <jilocasin0> What is mongodb, what versions community, commercial0, commercial1, commercialN exist. What are the differences
[18:43:14] <jilocasin0> Uggg... PI paywall infection even here. Links to a white paper on the .org site lead to a PI paywall on the .com site. :(
[18:45:22] <jilocasin0> Is there a link (preferably _not_ PI paywalled) that lists the various versions of mongodb, commercial or community, and what features they have?
[18:46:01] <jilocasin0> deathanchor: PI paywall == a form that attempts to extract personal information as the price for showing you information
[18:47:52] <deathanchor> https://www.mongodb.org/downloads#production and https://docs.mongodb.org/manual/release-notes/
[18:47:55] <jilocasin0> when I see a link that says "Download the White Paper" that's what I expect: a PDF, maybe (shudder) a .DOCX, or a zip file
[18:49:08] <deathanchor> or perhaps this is what you want specific for 3.2: http://blog.mongodb.org/post/132609328153/announcing-mongodb-32
[18:49:55] <deathanchor> leads to fancy white-paper-ish: https://www.mongodb.com/mongodb-3.2?_ga=1.66475246.1716032184.1436810646
[18:50:19] <jilocasin0> deathanchor: nope. download and run free for 30 days, download packages, notes on what's changed between versions.
[18:50:57] <jilocasin0> other links give high-level marketing fluff with prominent links to download the white paper or datasheet, which all lead to PI paywalls.
[18:51:01] <Derick> jilocasin0: going to ask some people where a comparison matrix is - but, in general, all features (besides enterprise integration stuff) are in the community version
[18:51:41] <jilocasin0> Derick: which would be nice, if I already knew what _all_features_ were.
[18:52:04] <Derick> it's a database, it does queries...
[18:52:09] <Derick> not sure what you're asking now
[18:52:21] <deathanchor> jilocasin0: I just dl'ed the whitepaper by putting in fake info
[18:52:27] <jilocasin0> Derick: then why would I use it instead of postgres, or hadoop, or sqlite?
[18:53:29] <deathanchor> free link to the pdf without PI paywall: https://webassets.mongodb.com/mongodb_whats_new_3.2.pdf
[18:55:18] <jilocasin0> deathanchor: thanks. Browsing through it, I read the section on the new MongoDB Encrypted Storage Engine
[18:55:57] <jilocasin0> what versions of mongodb is this in (I know because I asked during the webinar)? The text doesn't say.
[18:56:12] <Derick> deathanchor: marketing tools are quite good at weeding those out though
[18:56:26] <Derick> jilocasin0: the URL tells you: mongodb_whats_new_3.2.pdf
[18:56:31] <jilocasin0> actually it does, at the bottom, in the last sentence.
[18:56:52] <jilocasin0> community, enterprise, pro, some other version
[18:57:22] <deathanchor> isn't the only difference between community/enterprise the support model?
[18:58:35] <jilocasin0> deathanchor: nope, different versions apparently have different features. Which supports what is the outstanding question.
[18:58:43] <Derick> deathanchor: enterprise has kerberos auth
[18:59:51] <jilocasin0> and apparently the new version of Enterprise Advanced (is there an Enterprise _not_ Advanced?) supports the new Encrypted Storage Engine.
[18:59:51] <Derick> jilocasin0: asked some people internally for a matrix of some sort
[19:01:27] <jilocasin0> Derick: Thanks. I realize the folks behind mongodb aren't a charity (well, not the open source parts anyway), but at some point the marketing push starts driving away dbas/developers.
[19:02:46] <jilocasin0> Derick: if we weren't already using it internally (I'm an RDBMS DBA/developer; we have another who is the mongo DBA/developer) I would have run to _any_other_ noSQL solution by now.
[19:03:38] <jilocasin0> just trying to wrap my head around it, like pulling teeth from a chicken.
[19:07:32] <jilocasin0> deathanchor: currently working with lots of postgresql, formerly lots of MSSQL & SQLite.
[19:13:54] <jilocasin0> deathanchor: git's nice, better than mercurial, better than subversion, better than timestamped zip files. (anything was better than Visual Source Safe 6)
[19:14:14] <Derick> jilocasin0: apparently, there is nothing right now that either lists all features, or has a matrix comparison. Although, apparently they're about to publish something.
[19:14:30] <jilocasin0> Derick: behind a PI paywall?
[19:14:32] <Derick> jilocasin0: happy to answer if you have any specific questions like "can mongodb do this?"
[19:16:56] <jilocasin0> current understanding (possibly wrong): mongodb is an amorphous collection of JSON-like entities called 'documents'. It exists (in a file, a partition, an entire disk) somewhere and in some form.
[19:17:35] <jilocasin0> you can have multiple mongo databases across or within machines. You can shard your data somewhat like Hadoop.
[19:17:55] <deathanchor> jilocasin0: that's the gist.
[19:17:56] <jilocasin0> it has its own language that you access through a command line, mongod
[19:18:11] <deathanchor> jilocasin0: no it's basically JS scripting.
[19:18:17] <jilocasin0> you need a 'router' running on every client machine that accesses it.
[19:18:33] <deathanchor> jilocasin0: only if you shard
[19:18:34] <jilocasin0> there are lots of drivers in lots of languages
[19:21:15] <jilocasin0> cheeser: got that, my question was is it like an ODBC driver, where I just include it in the program, or do I have to run/install something else, like a router?
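(To the ODBC-style question: a driver is just a library linked into the program and talks to mongod directly over the wire protocol; the mongos router is only needed for sharded clusters. A minimal sketch with the Node.js driver of that era and hypothetical connection details:)

    var MongoClient = require('mongodb').MongoClient;
    // no separate daemon on the client; the driver connects directly to mongod
    MongoClient.connect('mongodb://db.example.com:27017/mydb', function(err, db) {
        if (err) throw err;
        db.collection('posts').find().limit(4).toArray(function(err, docs) {
            console.log(docs);
            db.close();
        });
    });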
[19:21:18] <Derick> Professional has the same features as Community, but different support (and a different feature set) than Enterprise.
[19:27:42] <Derick> Well, it's a Document Database, focussing on scalability and flexibility of use - over covering some traditional RDBMS features such as ACID
[19:28:00] <jilocasin0> short version: lots of JSON-like 'documents' managed somewhere with its own vaguely SQL-like language that's not ACID compliant, but works well for unstructured data.
[19:29:02] <Derick> it works well for structured data too :)
[19:29:20] <jilocasin0> relationships suck, buried updates suck, but free form data creation and reads are really good.
[19:30:41] <jilocasin0> apparently, if you have a fairly complicated/nested document and you want to change a couple of values buried deep in the document, it's _much_ faster to simply delete the existing document and create a new one with the updated values than to parse each document to get to the value you want to change and change it.
[19:31:15] <deathanchor> jilocasin0: yeah you are better off splitting the document into smaller parts then.
[19:31:15] <StephenLynx> yeah, I do that with more complex sub documents on mongo.
[19:31:32] <StephenLynx> it leaves you open for race conditions though.
[19:31:45] <jilocasin0> kinda wreaks havoc with immutable ids though.
[19:33:35] <jilocasin0> Derick/cheeser: do you know if mongodb is ever going to have a web seminar with _real_ information in it? Things that a dba or developer is interested in?
[19:34:03] <StephenLynx> I find people talking about webinars here and there
[19:34:08] <StephenLynx> I always wondered what the hell they are
[19:34:21] <jilocasin0> I've been to several (including today's) and so far they are marketing fluff designed to sell Enterprise (or other commercial) versions.
[19:36:03] <Derick> jilocasin0: what sort of stuff are you interested in?
[19:36:04] <jilocasin0> it's all: "ooh, look at this shiny new feature" (*enterprise only), and this one will let you do that, or save this much money/time/etc. Look at the pretty charts.
[19:36:21] <StephenLynx> so you want the documentation?
[19:38:02] <jilocasin0> no, I can (in theory) get that. Demonstrations, tips on what to do and what to avoid, best practices for different situations. Benefits (speed, security, performance, resources, etc.) of the 'shiny new features'.
[19:39:18] <jilocasin0> cheeser: sure, once I've been convinced that I need/like/want the shiny new feature. Webinars are traditionally (at least in the dev community) part show-and-tell and part what to do/not to do.
[19:40:55] <jilocasin0> cheeser: buried update == in a doc there are several subdocs, and in a subdoc there are values (or other subdocs); changing a value in a subdoc, in a doc (or another subdoc, etc.) is what I refer to as a 'buried update'
[19:42:06] <jilocasin0> cheeser: yep, but in our experience it's much faster to delete and create (with updated values) than to simply update (as opposed to postgres, where it's faster to simply update than to delete/create)
[19:42:23] <jilocasin0> one or two updates, not much difference
[19:42:29] <cheeser> i have my doubts about that but it's your code
[19:42:41] <jilocasin0> 100 or 1000, delete/create is faster.
[19:42:58] <cheeser> if you do lots of in-place updates, using powerOf2 sizing is recommended, I believe.
[19:43:11] <cheeser> if an update causes a document move, that can be expensive, yes.
[19:43:38] <cheeser> but it depends on the update. saying, wholesale, that deleting a doc and writing a new one is faster is just not true.
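(A sketch of the in-place alternative cheeser is defending, with hypothetical field names: $set with dot notation changes one deeply nested value without re-sending or rewriting the whole document, and the positional $ operator does the same inside arrays of subdocuments.)

    // update a value buried several levels deep, in place
    db.docs.update(
        { _id: someId },
        { $set: { "order.shipping.address.zip": "12345" } }
    )
    // update a field of one matching array element
    db.docs.update(
        { _id: someId, "items.sku": "A-7" },
        { $set: { "items.$.qty": 3 } }
    )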
[19:43:52] <jilocasin0> so these are the little things that would be nice to know (.org is definitely better than .com for these sorts of things, it looks like).
[19:44:09] <cheeser> .org is the community site. .com is the business site.
[19:44:16] <jilocasin0> cheeser: yes, that's why I qualified my statement with 'buried updates' are slower.
[19:44:43] <jilocasin0> it could be an artifact of our document structure...
[19:45:23] <jilocasin0> but it appears that long chains of relations are more problematic in some cases than subdocuments with mongo (as opposed to an RDBMS).
[19:45:53] <jilocasin0> I am sure it's just a matter of finding the right balance.
[19:46:22] <jilocasin0> but at this point I'm a novice at mongodb, though I have several decades of experience with RDBMS.
[19:48:08] <jilocasin0> [actually I would have to learn some more to be considered a novice ;) ]