#mongodb logs for Thursday the 31st of July, 2014

[01:32:04] <pomke> Hey :) I'd like to do an update (upsert) with a match query {_id : myData.id}, where myData.id may be null (JavaScript), which should indicate that a new record should be created (upsert rather than update)
[01:32:30] <pomke> What I'm getting is a new object where _id is literally set to null
[01:32:57] <pomke> Not a new ObjectID
[01:39:09] <joannac> test for that in your app?
[01:40:46] <pomke> I guess so :)
[01:41:04] <pomke> But then I could just call insert in a condition
[01:41:13] <pomke> So what is the value of upsert?
[01:44:40] <joannac> I'm not sure what you mean
[01:45:05] <joannac> if myData.id === null, insert, else upsert
[01:45:29] <joannac> the upsert is as you have it now, with a match query {_id : myData.id}
[01:45:54] <pomke> But I kind of expected it would replace the null with a new ObjectID
[01:49:10] <pomke> I'm probably doing something silly... but I'm receiving an object that may or may not exist in the collection, and I want to update it if it does or create it if it doesn't. I can check if it has an id and conditionally call update or insert; I just thought update with upsert somehow wrapped those two into one
[01:50:27] <pomke> Sorry if this is a newbie question :(
[02:00:41] <joannac> upsert says "does a document with the match clause exist? if so, update, otherwise, insert"
[02:01:19] <joannac> the implicit statement there is "add what's in the match part to the document"
[02:01:41] <joannac> whereas what you want is more like if A, then B, otherwise C
[02:01:53] <joannac> where B and C have no relation to A
[02:02:10] <joannac> ...that was a bad analogy
[02:03:05] <joannac> if you want to do an upsert with query part {_id: null}, that implies you want to either (a) update the document with {_id: null} or (b) insert a document with {_id: null}
[02:03:54] <pomke> So the best approach is to just conditionally call insert?
[02:07:34] <pomke> joannac: thanks for your help btw :)
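A minimal mongo-shell sketch of the conditional approach joannac describes above ("if myData.id === null, insert, else upsert"); the collection name "things" and the payload field "name" are illustrative:

    if (myData.id === null) {
        // no usable _id: plain insert, the server generates a fresh ObjectId
        db.things.insert({ name: myData.name });
    } else {
        // match on the known _id; upsert: true re-creates it if it vanished
        db.things.update({ _id: myData.id },
                         { $set: { name: myData.name } },
                         { upsert: true });
    }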
[02:40:18] <danecando> anybody here familiar with mongoose
[03:54:49] <pomke> If I have a collection of objects with an array of IDs, i.e. { tags : [ObjectID, ObjectID, ObjectID], ...}, and I want to find all matches where the object's tags array contains a specific ObjectID
[03:56:06] <pomke> I thought I'd need to do something like: find({ tags : { $elemMatch : SomeObjectID }})
[03:56:43] <pomke> But $elemMatch takes an object, I guess for doing $gt, $lt etc
[03:56:56] <pomke> but I want to match it explicitly to one value
[04:01:33] <joannac> what do you want the output to be?
[04:03:10] <pomke> All the objects that have SomeObjectId in their tags array
[04:03:20] <joannac> db.foo.find({tags: SomeObjectId})
[04:03:53] <pomke> Wow ok
[04:04:07] <pomke> I was making that way more complicated than it needed to be
[04:04:19] <pomke> thanks again :)
[04:05:25] <joannac> no probs
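For later readers: querying an array field with a plain value matches any document whose array contains that value, so no $elemMatch is needed. A sketch of joannac's answer (the ObjectId is illustrative):

    var someId = ObjectId("53d9f3e8a5c2b1d4e6f7a8b9");  // illustrative id
    db.foo.find({ tags: someId });  // matches docs whose tags array contains someId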
[04:15:13] <phrozensilver> can someone take a look at my problem real quick? https://gist.github.com/rdallaire/241dd48262ec856620c1
[04:15:50] <phrozensilver> I'm trying to post to my mongodb using mongoose and the name data posts fine, but I can't get the nested objects to fill for some reason
[04:16:08] <phrozensilver> wondering if my syntax is possibly wrong?
[04:16:34] <phrozensilver> or my schema
[08:27:40] <tadeboro> Hi all. Any idea why mongo shell would error out with the message: "exception: invalid operator '$day'"?
[08:27:51] <tadeboro> I'm using mongo 2.6.3.
[08:32:05] <joannac> what were you trying to run?
[08:32:46] <joannac> I don't think there's an operator called "$day"
[08:32:56] <joannac> so I would guess that's the problem
[08:44:24] <rspijker> tadeboro: you probably need $dayOfYear/Month/Week instead...
[08:48:58] <tadeboro> joannac, rspijker: Yep, you're both correct. I took the code from some online tutorials and it was bad.
[08:49:18] <tadeboro> $dayOfWeek is what I was after. Thanks for help.
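For anyone landing here later: $dayOfWeek (like $dayOfYear and $dayOfMonth) is an aggregation expression, not a query operator, which is why the shell rejected "$day". A hedged sketch, assuming an "events" collection with a "when" date field:

    db.events.aggregate([
        { $project: { when: 1, dow: { $dayOfWeek: "$when" } } },  // 1 = Sunday ... 7 = Saturday
        { $match: { dow: 2 } }                                    // keep only Mondays
    ])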
[09:06:18] <ic3st0rm> i have to work with currencies. all calculations are done with bigdecimal and in the minor unit of the currency. the output is a number without comma. is it safe to store this in mongodb as a number? $20 would be stored as 200
[09:07:18] <uf6667> ic3st0rm: use integers
[09:07:45] <uf6667> store $20 as 2000
[09:07:59] <uf6667> and don't forget to use spinlocks too )
[09:08:00] <uf6667> ;)
[09:08:56] <ic3st0rm> alright, but with mongoose i can't say insert it as an integer, it automatically gets inserted as a normal number (float)
[09:08:59] <remonvv2> ic3st0rm: If you're asking if it's safe to store any bigdecimal value in mongo's native integer format the answer is no. That said, there are no real world currency amounts that exceed 2^63.
[09:09:06] <remonvv2> So practically, it's probably safe.
[09:09:16] <ic3st0rm> not bigdecimal value but .toString() of bigdecimal
[09:09:33] <remonvv2> Why would you want to do that?
[09:10:59] <ic3st0rm> i need to store orders of my customers.
[09:11:55] <ic3st0rm> when a user buys 2 pieces for 10$ each, i calculate how much he must pay (amount * rate) and save the total
[09:14:47] <ic3st0rm> all calculations are done in cents, and i store the amount, rate and total in cents too.
[09:14:51] <uf6667> yes
[09:14:54] <uf6667> exactly
[09:15:11] <uf6667> if you do everything in cent, you'll never have rounding errors
[09:15:16] <ic3st0rm> when i store it in cents there are no commas
[09:15:18] <ic3st0rm> yes
[09:15:21] <uf6667> exactly
[09:15:23] <ic3st0rm> thats what i wanted to know
[09:15:25] <ic3st0rm> :)
[09:15:28] <uf6667> :)
[09:15:33] <uf6667> need any other help?
[09:15:46] <Zelest> *yawns*
[09:16:18] <ic3st0rm> not now but if i have any further questions i would ask them here
[09:16:32] <uf6667> cool
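The pattern being recommended here, as a mongo-shell sketch (the "orders" collection and field names are illustrative): keep every stored amount in integer cents and format as a decimal only at display time.

    // 2 pieces at $10.00 each, everything in integer cents
    var amount = 2, rate = 1000;
    var total = amount * rate;                   // 2000 cents = $20.00, exact
    db.orders.insert({ amount: amount, rate: rate, total: total });
    print("$" + (total / 100).toFixed(2));       // format only at the edge: $20.00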
[09:28:24] <ic3st0rm> uf6667: when i do $inc and add or subtract amounts, can there be rounding errors?
[09:28:38] <kali> ic3st0rm: yes
[09:28:57] <kali> ic3st0rm: it depends on what your client language does
[09:29:42] <kali> ic3st0rm: but typically, with the mongo js shell, numbers will be floating point unless you wrap them in NumberLong()
[09:30:40] <rspijker> or NumberInt() :)
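In the shell that looks like the following sketch (collection and field names illustrative); bare numeric literals become doubles, so wrap them to keep the stored type integral:

    db.accounts.insert({ _id: 1, balanceCents: NumberLong(2000) });
    db.accounts.update({ _id: 1 }, { $inc: { balanceCents: NumberLong(-500) } });
    db.accounts.findOne();   // { "_id" : 1, "balanceCents" : NumberLong(1500) }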
[09:33:08] <ic3st0rm> i use nodejs and mongoose
[09:33:38] <remonvv> Ah.
[09:33:53] <kali> then i have no idea :)
[09:33:57] <ic3st0rm> ^^
[09:34:42] <remonvv> ic3st0rm: MongoDB operates on int32, int64 and double as you'd expect (e.g. no rounding issues when working with int, etc.)
[09:35:35] <remonvv> JavaScript however has no explicit types for the above iirc.
[09:36:11] <ic3st0rm> calculations are done with bigdecimal, the numbers are stored in cents, add or subtract with $inc should be ok then?
[09:36:21] <dawik> actually. it has explicit types for arrays
[09:36:28] <dawik> Float32Array etc
[09:36:49] <remonvv> ic3st0rm: BigDecimal is not a thing in MongoDB. $inc operates on int32, int64 or double.
[09:36:50] <dawik> not sure about single types
[09:37:12] <dawik> types/objects
[09:37:29] <ic3st0rm> yea but bigdecimal outputs in cents so i always have numbers without decimal places
[09:37:37] <remonvv> So you'd be converting to one of those three first (most likely a double) and then store it. Alternatively it's being serialized to a string in which case...well...not sure what $inc does with that. Error probably.
[09:38:01] <remonvv> ic3st0rm: Yes, what I'm saying is that that "output" has to be converted to an integer type by you explicitly.
[09:38:07] <remonvv> Unless Mongoose manages that.
[09:38:43] <remonvv> bigdecimal is a weird type for currencies by the way.
[09:39:05] <ic3st0rm> what would you recommend? biginteger?
[09:39:50] <ic3st0rm> i saw that many people are using bigdecimal for currencies, it also has the bankers rounding (half_even)
[09:40:29] <dawik> bigdecimal is preferable for currencies yes
[09:40:33] <dawik> same with java :)
[09:40:42] <ic3st0rm> ok
[09:40:52] <remonvv> By who?
[09:42:35] <ic3st0rm> what do you mean with: by who?
[09:43:32] <dawik> I made a currency-converter app for android, and that is the conclusion i came to. from experiment and research
[09:43:56] <rspijker> well… financial data is usually required (by law) to be precise up to 5 decimal places
[09:44:16] <rspijker> so, either you have to go a lot further than cents as your basis, or you don’t use BigDecimals
[09:44:34] <dawik> there is a method to set the amount of decimals
[09:44:41] <rspijker> A lot of things aren’t priced as integer multiples of a cent
[09:44:42] <ic3st0rm> yes
[09:45:11] <rspijker> oh, sorry, I read that as BigInteger the entire time
[09:46:04] <ic3st0rm> for example: 10.20$ * 100000 is stored as 1020000, enough decimals?
[10:10:42] <remonvv> ic3st0rm: Can I ask what it is you're building?
[10:11:00] <remonvv> ic3st0rm: It's enough decimals for anything that doesn't do the actual financial transactions.
[10:13:12] <dawik> rspijker: wouldn't the law state that it should be no less than 5 decimal points?
[10:13:40] <rspijker> dawik: yeah, not up to, but at least
[10:14:00] <dawik> thought so :)
[10:14:23] <ic3st0rm> i plan a small site where users can buy/sell stuff
[10:24:31] <remonvv> rspijker: Where did you get that? Afaik that's only common for systems that manage funds or transactions and I'm not aware of any global laws for currency storage accuracy beyond that. Also, I think that's decimal places for the major currency unit (e.g. USD, not USD cents).
[10:26:56] <rspijker> remonvv: certainly not global. Fairly sure the local laws here governing accountancy practices make some requirements about accuracy. And it would indeed be on the major unit (I don’t think I claimed otherwise?) although that wouldn’t really matter conceptually.
[10:30:15] <remonvv> rspijker: I didn't mean you claimed otherwise, was just curious about the requirements. I think the law is more about how transactions themselves are handled (which may affect storage requirements). E.g. if you add one amount to another (or have to multiply for discounts, interests and so on) the operation has to be performed with specific accuracies and rounding rules. Interesting topic though. I (briefly, zzzz) worked for a comp
[10:31:24] <remonvv> rspijker: And it doesn't matter conceptually but it does affect how "valid" int64 is as a currency storage type.
[10:32:24] <rspijker> remonvv: I doubt there is any currency for which you will realistically get to the bounds of int64 ;) Although supposedly you could if you have some hyperinflated currency…
[10:32:27] <remonvv> rspijker: I'm not intimately familiar with any laws governing this by the way, it's an interesting topic though.
[10:32:46] <rspijker> and a currency like bitcoin which is, theoretically, infinitely divisible...
[10:32:49] <remonvv> rspijker: So if ic3st0rm is making an e-shop for a prince in Zimbabwe he has to worry eh
[10:33:03] <rspijker> that was the hyperinflation country i was thinking of, yes :P
[10:33:46] <rspijker> it is indeed interesting. I work for a company active in the billing/rating/invoicing space and so I deal with it a little
[10:33:55] <rspijker> although the legal/compliancy stuff is handled by someone else
[10:34:59] <remonvv> rspijker: I worked for a few more involved with it and I've researched into HFT systems and bank systems in general but that's about it. Currently we do little more than integrating with payment providers (which incidentally usually deliver amounts in cent accuracy).
[10:36:45] <rspijker> we deal with a couple of telcos. They usually charge something like 5 cents per MiB for instance, but the information we get (CDRs) has quantities in KiB and they do say in the fine print that they charge per KiB (sometimes per 10KiB for some reason). So you get prices of 5/1024 cents
[10:36:48] <remonvv> I'd always advise against BigDecimal/BigInt types in currency handling code and in favor of int64. The amount of subtle bugs that can surface when having to move BigDecimal-like values between different systems is daunting.
[10:37:19] <remonvv> Ah, interesting.
[10:37:34] <rspijker> even worse with some IaaS customers
[10:37:50] <remonvv> I can only imagine.
[10:37:55] <rspijker> they charge for CPU usage, for instance, they have a price per month, but sometimes want to bill per second
[10:38:05] <rspijker> so the amounts get small :)
[10:38:19] <rspijker> anyway, lunch time :)
[10:38:22] <remonvv> Same
[10:38:26] <remonvv> ic3st0rm: Good luck ;)
[10:39:18] <ic3st0rm> thanks ^^
[11:03:10] <ernetas> Hm...
[11:03:31] <Derick> hi!
[11:03:40] <Derick> having any luck with data recovery?
[11:04:28] <ernetas> Hi! mongorepair failed, so did mongodump, maybe because of missing .ns file. I'll later try to recover that too, although I'm not sure if that will be possible.
[11:04:52] <ernetas> I'm now trying to use https://github.com/MongoHQ/purplebeard to extract BSON data, but I'm getting "UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte". Crap!
[11:05:10] <remonvv> Uh oh.
[11:24:17] <ernetas> Looks like there isn't going to be a(n easy) solution?
[12:55:51] <denispshenov> Hi everyone. Could you please look at the following question and see if you can help, thanks. http://stackoverflow.com/questions/25059249/how-to-match-multiple-subdocument-in-mongodb
[13:01:18] <rspijker> denispshenov: well… the two $’s won’t work. Because you have no guarantees that the reader or like is found...
[13:01:49] <rspijker> you can do it with aggregation and a $unwind
[13:02:01] <rspijker> well, two $unwinds
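A hedged sketch of the double-$unwind pipeline rspijker means; the "posts" collection and the "readers"/"likes" array fields are guessed from the linked question:

    db.posts.aggregate([
        { $unwind: "$readers" },                          // one doc per reader
        { $unwind: "$likes" },                            // ...then one per reader/like pair
        { $match: { "readers.id": 5, "likes.id": 8 } },   // both conditions on the same pair
        { $group: { _id: "$_id" } }                       // collapse back to matching documents
    ])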
[13:39:37] <squish> Hi, I'm just learning how to use mongodb via pymongo, and I have a beginners question: Is there something like the .distinct() method, that returns a list of possible keys from within a collection? distinct() only returns values.
[13:41:06] <squish> Is anyone here?
[13:41:18] <squish> hrm.
[13:41:40] <Derick> plenty of people here
[13:41:59] <Derick> distinct is a command in MongoDB
[13:42:57] <ic3st0rm> http://docs.mongodb.org/manual/reference/method/db.collection.distinct/
[13:43:49] <rspijker> you want distinct keys instead of distinct values for a given key, squish?
[13:44:28] <rspijker> if so, there is nothing like that
[13:45:11] <squish> No. Distinct values for a key would be nice too, but my question was how to get all possible keys from within a collection.
[13:45:41] <ic3st0rm> rspijker: do you think he can use $group for that?
[13:45:50] <squish> And distinct is a method in pymongo.
[13:46:48] <rspijker> distinct will get all distinct values for a given key in a collection
[13:47:18] <squish> I know.
[13:47:24] <rspijker> there is no way to get all existing keys from within a collection
[13:47:36] <rspijker> short of retrieving every document and iterating every single one
[13:47:59] <rspijker> ic3st0rm: you can use $group to get distinct values, but distinct already does that directly
[13:48:01] <Derick> i think stackoverflow has a map/reduce job to do that though
[13:50:51] <squish> Okay. I thought I must be missing something, because this functionality seemed like something that should exist. But if it doesn't I can learn on. Thank you, guys.
[13:52:43] <rspijker> squish: mongo is schema-free.. So the amount of _possible_ keys in a certain collection is nearly unlimited
[13:52:50] <feathersanddown> Hi
[13:52:55] <rspijker> the ones that actually exist are probably very limited
[13:53:12] <rspijker> in practice, if you have a schema, it's actually very easy to do with javascript in the shell
[13:53:16] <feathersanddown> are the $text selector and text indexes made for huge texts?
[13:53:42] <rspijker> simply go: var x = db.coll.findOne(); for(var f in x){print(f)}
[13:54:05] <feathersanddown> I mean... if I want to search a user name can I use the $text selector?
[13:54:39] <feathersanddown> or is it made for text search inside a lorem ipsum style text?
[13:55:05] <rspijker> feathersanddown: I don’t understand the distinction
[13:55:46] <feathersanddown> the scope of $text selector
[13:56:18] <feathersanddown> or is it not important whether it's used on short texts or huge texts?
[13:56:28] <rspijker> it’s not
[13:57:01] <feathersanddown> a user name like "foo_name bar_second_name" and to search for "second" text
[13:57:05] <squish> This would return the keys for a single object. My problem is more that I have a collection which includes documents with slightly different schemas. Some have a few keys added, which others don't.
[13:58:09] <rspijker> squish: then it’s more difficult. Still quite easily doable. But you would have to iterate over every document to make sure you get all the keys...
[13:59:00] <rspijker> squish: something like this http://pastie.org/private/5dizeq7tiyuifjms5qwva
[13:59:07] <squish> This was my first idea, but I thought, that there might be some server-side method doing exactly that.
[13:59:14] <rspijker> if your collection is large, it will take a long time...
[13:59:25] <rspijker> nope, there isn’t
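Since the pastie link above is private and may rot, here is a sketch of the approach rspijker describes: iterate every document and collect its top-level keys (slow on a large collection):

    var keys = {};
    db.coll.find().forEach(function (doc) {
        for (var f in doc) { keys[f] = true; }   // top-level keys only
    });
    for (var k in keys) { print(k); }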
[14:00:29] <squish> Okay. Again, thank you for the help.
[14:01:54] <squish> bye.
[14:23:35] <kali> http://blog.fotopedia.com/fotopedia-shutdown/
[14:54:14] <feathersanddown> is it possible to create a text index on text fields that are inside objects inside a collection?
[14:55:12] <feathersanddown> db.collection: { object_foo: { field_to_search: "any text", field_to_search2: "any other text" } }
[14:55:34] <feathersanddown> create text index over "field_to_search" and "field_to_search2"
[14:55:39] <feathersanddown> is it possible?
[15:01:07] <ruairi> feathersanddown: Inside a collection, you have documents. Inside those documents, you have fields. Are you asking if you can create a text index on a field in a document?
[15:02:14] <feathersanddown> fields inside documents that are inside documents
[15:02:39] <feathersanddown> it seems to work on text fields inside a document only
[15:03:19] <rspijker> feathersanddown: you need to specify the field(s) to index
[15:03:34] <rspijker> if that’s a subdocument, you can use the dot notation to specify it
[15:03:46] <rspijker> in your case: object_foo.field_to_search
[15:04:36] <feathersanddown> db.collection.object_foo.field_to_search
[15:05:06] <rspijker> no…
[15:05:19] <rspijker> you create the index already on db.collection, so it's relative to that
[15:05:57] <rspijker> db.collection.ensureIndex({object_foo.field_to_search:"text", object_foo.field_to_search2:"text"})
[15:06:13] <rspijker> that would create a text index on those 2 fields inside of the document
[15:07:24] <feathersanddown> uhm
[15:08:11] <feathersanddown> "unexpected token ."
[15:08:39] <feathersanddown> inside ""
[15:09:21] <feathersanddown> aaahhh finally works
[15:09:23] <feathersanddown> :)
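For the record, the form that finally worked: dotted field names must be quoted, otherwise the shell throws the "unexpected token ." error seen above:

    db.collection.ensureIndex({ "object_foo.field_to_search": "text",
                                "object_foo.field_to_search2": "text" })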
[15:11:39] <feathersanddown> is the index over _id a default index?
[15:12:07] <rspijker> yes
[15:19:53] <feathersanddown> is it droppable?
[15:23:57] <rspijker> feathersanddown: nope
[15:28:14] <feathersanddown> i'm still wondering if I'm right to use $text (and a text index) to do short text field searches
[15:28:43] <rspijker> are you querying for equality or for contains?
[15:28:52] <feathersanddown> fields like "user name", "location"
[15:28:54] <feathersanddown> contains
[15:29:28] <feathersanddown> a user named "peter" in a "user name" field
[15:29:40] <feathersanddown> a list of users too
[15:29:48] <rspijker> can you make the list of users an array?
[15:30:17] <feathersanddown> oh no, I mean, a returned list from a keyword match
[15:30:55] <rspijker> I’m not sure what you mean by that
[15:30:56] <feathersanddown> but for fields that aren't too long
[15:31:06] <feathersanddown> max 100 chars
[15:31:34] <feathersanddown> to me, a text search should be done in a "lorem ipsum" huge text
[15:35:07] <rspijker> well, you don’t have to use text indices...
[15:35:16] <rspijker> you can search on equality normally, you can use regexes
[15:35:24] <rspijker> they can even be indexed if you are searching the start of a string
[15:35:34] <rspijker> (regexes, that is)
[15:35:55] <rspijker> without knowing your usecase any better than the vagueness you are giving me atm, I can’t tell you whether or not that will work for you
[15:37:49] <feathersanddown> ok thanks
[15:37:50] <feathersanddown> :)
[15:39:08] <rspijker> I gtg, good luck :)
[15:45:38] <flippyhead> Hi! Anyone have suggestions on how to "bundle" mongodb with an OSX Application ?
[15:58:42] <culthero> Hello, can you somehow use a b-tree query on a range to reduce the scanned results on a full text field in the same collection? For instance you have a timestamp, then a text field containing tweets?
[15:59:55] <culthero> When I do an explain just using the btree, it is lightning fast, but if I add in the $text: {$search: "text search here"} it scans over all records inside the fulltext index, rather than the subset of records in the index that fall within that range
[16:00:12] <culthero> erm, all the records that contain "text search here"
[16:00:20] <culthero> meaning that no matter what, as the data set grows the search will slow
[16:00:26] <dawik> use a string as key
[16:00:49] <dawik> and check that, using an incrementor as an argument (one way)
[16:01:10] <dawik> one way of doing what you want, i believe
[16:02:21] <culthero> Use the string as a key? Each document in the collection has 3 indexes: _id, inserted (date), text (fulltext)
[16:02:33] <culthero> doing b-tree queries = fast
[16:02:43] <dawik> oh i thought this was ##C
[16:02:49] <culthero> :)
[16:02:49] <denispshenov> Everyone, if you have time could you please see this question again. http://stackoverflow.com/questions/25059249/how-to-match-multiple-subdocuments-in-mongodb
[16:02:55] <dawik> my bad :O
[16:02:58] <culthero> np
[16:58:24] <culthero> So before I rebuild a 10gb index, does someone know if a compound index on a fulltext field + a b-tree field (such as a date) will be.. what is the word, selective? (IE, range + text)
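A note on culthero's question: MongoDB (as of 2.6) does allow a compound index with a leading non-text key ahead of the text key, but a $text query must then supply an equality match on that leading key, not a range, so a raw timestamp prefix won't help. A sketch assuming a pre-computed daily bucket field "day" alongside his "text" field:

    db.tweets.ensureIndex({ day: 1, text: "text" });
    db.tweets.find({ day: ISODate("2014-07-31T00:00:00Z"),
                     $text: { $search: "text search here" } });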
[17:42:27] <Almindor> hello
[17:43:02] <Almindor> does anyone know how to convert a BsonDocument into a class? It works directly with Cursor<myclass> but I need it to be done one by one due to possible errors
[17:43:19] <Almindor> that's in the C# driver
[17:53:58] <Es0teric> anybody here?
[18:00:28] <culthero> i am
[18:00:57] <daveops> i'm not
[18:03:22] <Es0teric> culthero: i have some keys like _id, books (array), genre
[18:03:35] <Es0teric> how do i find all results in books?
[18:07:36] <Almindor> Es0teric: what do you mean find all results in books? find all documents but just get the books field?
[18:07:53] <Es0teric> Almindor: yes
[18:08:29] <Almindor> Es0teric: db.yourcollection.find({ ... }, { "books": 1, "_id": 0 });
[18:09:47] <Es0teric> Almindor: what do you mean by { … }
[18:09:48] <Es0teric> ?
[18:10:04] <Almindor> Es0teric: any query filters for the documents themselves you need
[18:10:10] <Almindor> Es0teric: if you need all, just go {}
[18:30:16] <Es0teric> Almindor: -> 2014-07-31T14:27:05.582-0400 SyntaxError: Unexpected token .
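The SyntaxError above most likely comes from typing the "…" placeholder literally; with no filter (per Almindor's "just go {}") the working form would be:

    db.yourcollection.find({}, { books: 1, _id: 0 })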
[18:34:11] <mikebronner> Es0teric: one sec
[18:34:16] <mikebronner> i'll get you a query
[18:34:36] <cozby> hi guys, I have a simple app that connects to mongodb (via mongoose) and shoves data in it. It inserts quite a bit of records, and I'm seeing a lot of Error: connect ETIMEDOUT when connecting to my mongo instance
[18:34:36] <mikebronner> Books::all() should get you all books
[18:34:43] <Es0teric> mikebronner: correct
[18:34:46] <mikebronner> if using Moloquent
[18:34:49] <cozby> it connects and inserts fine for about a minute then I start seeing a lot of those timeout errors
[18:34:59] <Es0teric> mikebronner: but i am trying to get arrays from a key
[18:35:03] <cozby> anything I should be looking out for on the mongo side of things?
[18:35:10] <cozby> I'm using 10gens AWS image
[18:35:23] <cozby> so all the OS sysctl changes have been made
[18:35:34] <Es0teric> so doing Books::find(‘…’, [‘book’]); wont give me the arrays
[18:36:15] <cozby> I'm looking at lsof -i and I'm seeing a lot of connections in the established mode ~34 for mongo
[18:36:30] <cozby> is it possible all mongo connections are exhausted?
[19:17:51] <Almindor> \quit
[19:35:02] <jamesaanderson> I have a bikes collection and a rentals collection. My rentals collection has a bike_id and my bikes collection has a user_id. How should I go about finding all rentals with a status of "pending" and bike user_id of "xyz"? Am I using MongoDB too much like a relational DB and if so how should I fix this?
[19:37:28] <stefandxm> i am more curious what bikes you have :)
[19:41:23] <jamesaanderson> Haha just a road bike. With a relational db I could do a join, but obviously that isn't possible in MongoDB, so I'm thinking I'm going to have to redesign my schema somehow
[19:42:44] <stefandxm> dunno. i dont use mongodb for 'that' :)
[19:43:12] <stefandxm> road bikes are no fun
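Since there are no server-side joins, the usual pattern for jamesaanderson's case is two queries client-side; a sketch, with collection and field names taken from his description:

    var bikeIds = db.bikes.find({ user_id: "xyz" }, { _id: 1 })
                          .toArray()
                          .map(function (b) { return b._id; });
    db.rentals.find({ status: "pending", bike_id: { $in: bikeIds } });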
[19:57:56] <jonyfive> hello, does anyone know how to register for an exam? https://www.mongodb.com/products/training/certification says "Opens Jul 31" but there's no link to click or anywhere i see to register
[20:10:43] <BlakeRG> does Mongo generally perform better on many smaller shards or a single beefy machine
[20:22:33] <pasichnyk> I have a large collection (currently 500gb and about 500M documents but growing), and all documents have an ISODate() property, which i need to do filtering queries on. In this case the ISODate() is an exact date of an event, however when i do my queries, i'm ok with simply daily resolution (i.e., find me events from 2 days ago). I'm scared to add the ISODate() into my index, for fear that
[20:22:33] <pasichnyk> the cardinality of it will bloat the index. Is it suggested to add something with a better cardinality (i.e., just a date without further precision) and index on that instead? That would obviously take more storage space but seems like it might be a better choice? Suggestions appreciated. :)
[20:24:31] <stefandxm> tbh, i dont think you will save much
[20:24:42] <stefandxm> an isodate should easily fit a 64bit integer
[20:38:06] <feathersanddown> Can I search for a subdocument field and then return only subdocuments instances ?
[21:00:37] <pasichnyk> stefandxm, but does the fact that the cardinality is super high (new events coming in constantly) mean that index is going to be super huge?
[21:11:04] <daidoji> pasichnyk: stefandxm is right, an index should easily hold an isodate; however, you are correct in that typically it's not a good idea to index by date due to the index b-tree depth increasing with cardinality
[21:11:24] <daidoji> pasichnyk: thus, as you get more events, your index depth grows, which is what you probably don't want
[21:12:25] <pasichnyk> daidoji, yeah, i really just want to be able to limit the documents i'm looking at by a date or date range, so it doesn't have to scan through .5TB (and growing) of documents every time i need to find something that i know the date of.
[21:13:05] <daidoji> pasichnyk: then it's probably best to index by a key that's a truncated version of the date, at the granularity of the most common range you query, but you can keep it in ISODate format
[21:13:16] <pasichnyk> daidoji, because with the precision of ISODate() basically all of my documents will have unique datetimes... :/
[21:13:24] <daidoji> right
[21:13:56] <pasichnyk> ok, so store two ISODate()'s, one with full precision of datetime, and then one as just the date or date+hour even?
[21:14:22] <daidoji> pasichnyk: or better yet, just store it in 64bit integer as stefandxm suggests
[21:14:47] <daidoji> pasichnyk: depends on the granularity you want in your date range
[21:14:48] <pasichnyk> doesn't ISODate store as that underlying type anyway? Would it make much difference?
[21:15:06] <daidoji> pasichnyk: not sure, don't know how the query optimizer determines that stuff for Mongo
[21:15:20] <pasichnyk> yeah, i think daily would be good, but hourly would be even better, because then i could chew off smaller chunks of work
[21:15:41] <daidoji> pasichnyk: yeah, well index things by epoch date truncated to the hour I guess
[21:15:47] <daidoji> that's what I'd do anyways
[21:16:46] <pasichnyk> ok, i'll readup on storing dates and the formats people use, along with pros/cons
[21:19:06] <pasichnyk> looks like from the Bson spec... UTC datetime - The int64 is UTC milliseconds since the Unix epoch.
[21:40:04] <daidoji> pasichnyk: then you're probably okay indexing on that with the truncation applied beforehand
[21:48:08] <stefandxm> i would just index it as-is
[21:49:22] <stefandxm> generally, you will want the same cardinality on the index as you use in your query
[21:49:35] <stefandxm> dont know how 'smart' mongodb is in this tho
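What the thread converges on, as a shell sketch (the "events" collection and field names are illustrative): store the full-precision timestamp plus a copy truncated to the hour, and index only the truncated copy so its cardinality stays bounded:

    var when = new Date();                        // full-precision event time
    var hour = new Date(Math.floor(when.getTime() / 3600000) * 3600000);
    db.events.insert({ when: when, whenHour: hour });
    db.events.ensureIndex({ whenHour: 1 });
    // daily-resolution query: an hour-bucket range over 2014-07-29
    db.events.find({ whenHour: { $gte: ISODate("2014-07-29T00:00:00Z"),
                                 $lt:  ISODate("2014-07-30T00:00:00Z") } });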