[03:16:09] <Lonesoldier728> anyone here use mongoose? trying to figure out what a reference looks like in a doc - does it store just the object id or the whole doc
[03:17:52] <Boomtime> @Lonesoldier728: i don't use mongoose, but i expect you'll get some background info from this; https://docs.mongodb.com/manual/reference/database-references/
[03:19:02] <Lonesoldier728> ok looks like just an id
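A minimal mongoose sketch of what that looks like in practice; the Author/Book names are hypothetical, but the stored value really is just the ObjectId, and populate() fetches the referenced document on demand:

```javascript
var mongoose = require('mongoose');

var authorSchema = new mongoose.Schema({ name: String });
var Author = mongoose.model('Author', authorSchema);

var bookSchema = new mongoose.Schema({
    title: String,
    // persisted in MongoDB as e.g. { author: ObjectId("572a...") }
    author: { type: mongoose.Schema.Types.ObjectId, ref: 'Author' }
});
var Book = mongoose.model('Book', bookSchema);

// populate() resolves the stored id into the full Author document
Book.findOne({ title: 'Moby-Dick' }).populate('author').exec(function (err, book) {
    if (err) return console.error(err);
    console.log(book.author.name);
});
```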
[04:51:02] <YokoBR> i'm getting First key in $update argument is not an update operator when i try to findOneAndUpdate
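That error usually means the update document's first key is a plain field rather than an atomic operator. A hedged sketch of the common fix, mongoose-style (Model, id, and the field name are placeholders):

```javascript
// Wrap plain fields in $set so the update document leads with an operator;
// passing { name: 'new' } on its own is what triggers the complaint.
Model.findOneAndUpdate(
    { _id: id },                  // filter
    { $set: { name: 'new' } },    // atomic operator first
    { new: true },                // return the updated doc instead of the old one
    function (err, doc) {
        if (err) return console.error(err);
        console.log(doc);
    }
);
```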
[06:55:32] <febuiles> Is it possible to do $push and $pull in a single update?
[06:55:47] <febuiles> (operating on the same property)
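No one answered in channel, but MongoDB rejects $push and $pull on the same field within one update as conflicting modifications, so the usual workaround is two sequential updates. A sketch with hypothetical collection and field names:

```javascript
// One update with both operators on "items" fails with a conflict error,
// so split the $pull and the $push into two calls:
db.lists.update({ _id: id }, { $pull: { items: 'old' } });
db.lists.update({ _id: id }, { $push: { items: 'new' } });
```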
[07:29:58] <ayee> The automation agent is grabbing my hostname and passing it around to other automation agents. Though my hostname isn't resolvable by dns. It's just some random string. Is there a way to have the automation pass around the ip address instead?
[07:38:27] <ayee> Boomtime: thanks a lot, that helps
[07:38:48] <ayee> Boomtime: I also have stale agents that have been deleted in ec2, but I can't get rid of them in the webui. any tips?
[07:42:00] <ayee> also I can't find that preferred hostname setting. I'm using the mongodb ops manager, is this only available in the cloud manager perhaps?
[07:46:07] <ayee> hmm, I can't find 'group settings' anywhere. If I click on admin on the top right, I see 'Groups', but not 'Group Settings'
[07:46:31] <Boomtime> well, you shouldn't be using ops manager without a subscription, and since you definitely wouldn't be running it without one, you can raise a support ticket with your subscription
[07:48:28] <ayee> I just installed ops manager today, I'm on the 30 day trial
[07:48:46] <ayee> I'm trying to play/test it out, I haven't gotten to the stage where I can spin up a replica set yet
[07:49:25] <ayee> I'm actively selling the tool to management though, I think they'll let me pay for it. but I have to show a working demo first. heh
[07:50:22] <ayee> Is the trial version different than the subscribed version, meaning buttons are missing?
[07:50:56] <Boomtime> well, you know how to do what you want in cloud, and it's much easier to set up (since there is nothing to configure or provision) - why don't you just demo that? the UI is nearly identical
[07:52:40] <ayee> They'll want to see an on prem version, working with our prod images, in our private VPC, and working on aws and openstack.
[07:52:58] <ayee> We're not allowed to use cloud services, it has to be on prem.
[07:53:12] <ayee> I could just show them heroku if I could use a cloud service, that would be easier.
[08:10:10] <ayee> ahh, I was confused. I was going to admin -> groups. I should be going to settings -> group settings. Boomtime what value do I add, ends with or regex? and how do I specify an ip for all agents. This is confusing.
[09:44:31] <ayee> hmm I still get host unreachable with ^10.* in my Preferred Hostnames.
[11:17:22] <kurushiyama> ayee: Uhm, ips aren't exactly hostnames, are they? How does your name resolving work?
[13:09:37] <Mmike> Hi, lads. I'm trying to find out when the opIdMem option was removed from mongod, does anyone know?
[13:30:30] <grug> this probably isn't the right place to ask but basically i have a nodejs application that connects to a mongo database using the node mongo driver. i am querying a large collection and my results stream prematurely terminates any time i try to stream results from a query: http://stackoverflow.com/questions/37188184/mongo-connection-terminates-early-unexpectedly even if someone could point me in the right direction (can't see a channel for the mongo driver) that would be much appreciated! (code is in the SO link i posted)
[14:49:25] <oky> grug: what is your batch size? is it relative to the number of docs? (as in, is it stopping after one batch?)
[14:55:33] <grug> oky: nah it's not stopping after one batch - the batch size is 1000 at the moment
[14:55:58] <grug> changing the batch size to be more only slightly increases the amount of docs that get processed before the stream ends (i.e. it's not linear)
[15:13:41] <oky> maybe there's a bug in the way you are filling currentBatch and dispatching it - maybe check how many times it gets filled (+ how big it is) and so on
[15:13:48] <oky> and then also count the number of docs coming over the mongo stream
[15:15:22] <grug> yeah that's what i've been just messing with at the moment. i think it must be a bug with my logic in how i fill my batches and dispatch them because if i remove the amazon upload stuff and purely just count documents coming in over the stream (i.e. in the stream.on('data'... section) then it gives the correct count
[15:15:44] <mylord> how do you find inside the 0th array element of transactions? db.payments.find({"transactions[0].amounts.total":"1.00"}).pretty()
[15:15:45] <grug> so i'm just trying to work out where the flaw in the logic is - need to do a count of the docs coming in compared to how many were sent
[15:16:59] <oky> grug: good luck - add more and more prints :-D
[15:18:20] <oky> grug: do you dispatch when the stream ends? (what do you do with the remaining items in the batch)
[15:18:44] <oky> anyways, ttyl - i'm sure you'll figure it out
[15:19:32] <grug> well the problem is that i can't store all the documents in memory - node falls over, i believe, otherwise i'd just store it all in memory and dispatch once the stream ends
[15:20:55] <grug> i do pause the stream, so i imagine that stops the 'end' event from being fired
[15:42:13] <kurushiyama> oky: Could it be that the processing of the "stream" takes quite a while?
[15:43:29] <grug> kurushiyama: were you referring to me or oky?
[15:43:41] <grug> i'm not sure what you mean by the 'processing' of the stream
[15:44:42] <kurushiyama> grug: Sorry. yes. Ok. you read from a cursor, and after a while that reading terminates, as far as I got it?
[15:47:11] <kurushiyama> grug: Or, to put it differently: You do a query and iterate over the result set and then suddenly the connection breaks?
[15:49:06] <kurushiyama> grug: How long does it take for the connection to break?
[15:49:09] <grug> the 'end' event is fired before i am done processing my results
[15:49:23] <grug> kurushiyama: depending on how many documents are returned by the query, anywhere between 1 minute-5 minutes
[15:50:29] <kurushiyama> grug: Hm. You might want to disable the cursor timeout just to make sure.
[15:51:34] <kurushiyama> grug: May I ask what you're trying to achieve? Maybe we can find an aggregation or something like that, which does not force you to iterate over a large result set?
[15:53:37] <cpama> just wondering if anyone can shed some insight into this: http://stackoverflow.com/questions/37189133/mongo-php-app-error-fatal-error-uncaught-exception-mongoexception-with-messa
[15:55:01] <grug> kurushiyama: i have disabled the timeout - it doesn't do anything, unfortunately. what i am trying to achieve is that any documents in my collection that have to be added to an AWS search index get uploaded
[15:55:07] <grug> unfortunately an aggregation won't help here
[15:55:23] <grug> because there is always the chance that there will be a large set of documents that need to be added to the search index
[15:56:21] <kurushiyama> grug: Do you have MMS charts so we can identify the problem?
[15:58:05] <kurushiyama> grug: Uhm, I do not get it. You have multiple requests? But expect a single result set?
[16:02:46] <grug> kurushiyama: i never said that. i have a query that has about 1.3 million results, which i stream since it's too big to fit in a single response (due to bson size limits) - what is happening is that i want to batch up the responses (i.e. every time a data event is fired, i add the document that is fired to a batch) and upload them in batches of X (which may be 1,000 for arguments sake) to my AWS service
[16:03:04] <grug> the problem is that the db fires the 'end' event while i am still processing documents from 'data' events
[16:04:47] <oky> "8:13 grug | i do pause the stream, so i imagine that stops the 'end' event from being fired"
[16:04:48] <kurushiyama> grug: Hm. I am no node expert, tbh.
[16:06:02] <oky> grug: oh, i was just appreciating how the problem solving went. it started with stating your assumptions, then drilling down and figuring out that the assumption wasn't valid
[16:06:34] <oky> the statements: "i imagine it behaves this way", followed by 30 minutes investigation and then "it does not behave this way" made me laugh
[16:07:55] <grug> well im still not entirely sure whether my assumption is valid or not
[16:12:21] <oky> grug: hopefully i'll get more lols, then :P
[16:15:56] <grug> oky: ok so i added a .on('error', function()... handler and got the following error
[16:16:05] <grug> { [MongoError: cursor killed or timed out] name: 'MongoError', message: 'cursor killed or timed out' }
[16:16:23] <grug> so it has to be a timeout thing... but it doesn't seem to matter what i set timeout values to, it still does the same thing
[16:16:36] <grug> and that error doesn't fire every time either
[16:16:52] <grug> sometimes the cursor ends without an error, but doesn't process everything like i expect
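For reference, a minimal sketch of the batching pattern under discussion, assuming the 2.x node driver; uploadToAws is a hypothetical stand-in for the AWS upload step:

```javascript
// Disable the server-side cursor timeout, pause the stream while each
// batch uploads, and flush whatever remains when 'end' fires.
var stream = collection.find(query)
    .addCursorFlag('noCursorTimeout', true)  // cursor survives slow uploads
    .batchSize(1000)
    .stream();

var currentBatch = [];

stream.on('data', function (doc) {
    currentBatch.push(doc);
    if (currentBatch.length >= 1000) {
        stream.pause();
        uploadToAws(currentBatch, function (err) {  // hypothetical uploader
            if (err) return console.error(err);
            currentBatch = [];
            stream.resume();
        });
    }
});

stream.on('end', function () {
    // don't drop the final partial batch
    if (currentBatch.length) uploadToAws(currentBatch, function () {});
});

stream.on('error', console.error);
```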
[16:23:16] <mylord> how do I find an element where myarray[0].value == "x"?
[16:37:14] <ioggstream> hi @all. anybody knows why the explain() indexOnly attribute has been removed in 3.0 ?
[17:10:23] <Bookwormser> What replaced auth=true/false in Mongo 3.2? I'm trying to enable username password authentication in the conf file, but i've not been successful.
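For the record: the ini-style `auth = true` was superseded by the YAML config format introduced in 2.6. A minimal mongod.conf fragment:

```yaml
# mongod.conf (YAML): replacement for the old "auth = true" line
security:
  authorization: enabled
```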
[17:52:35] <mylord> how do you find in 0th position in a array?
[17:53:31] <mylord> ie, how to find({"myarray.0.amount":"123"})?
[18:02:05] <kurushiyama> mylord: Iirc, you can not rely on array order. If you have to, there is most likely something wrong with your data model in the first place.
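As for addressing the 0th element itself: the shell uses dot notation with a numeric index, not bracket syntax, so mylord's first query would be written as:

```javascript
// matches documents whose first transactions element has amounts.total "1.00"
db.payments.find({ "transactions.0.amounts.total": "1.00" }).pretty();
```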
[18:04:35] <cpama> hi all. i need some ideas on where I might have gone astray with my PHP code. Trying to insert an array into a collection. I've tested to make sure it's legit json. I've manually added the data to the collection using robo mongo and it seems to work.
[18:04:41] <cpama> but i can't get the php code happy
[18:14:33] <cpama> Derick so i shouldn't be doing that?
[18:14:51] <kurushiyama> Why is it that the most unfortunate documents always seem to be the most read? ;)
[18:15:00] <cpama> kurushiyama, ok. i will do that. create a global connection object and reuse that...
[18:15:36] <cpama> Derick, kurushiyama so which document/tutorial should i follow?
[18:15:39] <Derick> cpama: you are however using the old deprecated driver, so if this is a new project, please please use the new one: https://github.com/mongodb/mongo-php-driver / https://pecl.php.net/package/mongodb - perhaps in combination with the library: http://mongodb.github.io/mongo-php-library/ / https://packagist.org/packages/mongodb/mongodb
[18:16:09] <Derick> cpama: just let the driver generate _id for you, if you don't have another candidate key yourself
[18:16:20] <kurushiyama> cpama: Here is my advice: KISS, and KISS a lot. Use as little moving parts as you can possibly manage.
[18:16:25] <cpama> Derick, I'm limited to this one for now... because I don't control the environment I build in. However, I have asked my sys admin to look into the updates for me
[18:17:02] <cpama> Derick, so if i insert a PHP array without the "_id" it should just create one, no?
[18:17:04] <Derick> cpama: you're using an old PHP version too
[18:17:13] <Derick> cpama: yes - it will create an ObjectID value as _id then
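Illustrated in the shell (the drivers behave the same way): omit _id on insert and an ObjectId is generated automatically:

```javascript
db.test.insert({ name: "example" });   // no _id supplied
db.test.findOne();
// { "_id" : ObjectId("..."), "name" : "example" }
```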
[18:17:41] <cpama> ok. and here I thought I was being a good little programmer by reading the docs.
[18:19:14] <kurushiyama> cpama: Which is good. I would suggest playing around with the shell a bit, scrap any GUI tools and get used to MongoDB. One of the best parts you can read is https://docs.mongodb.com/v3.0/applications/data-models/
[18:23:18] <kurushiyama> I just say "aggregation pipeline"...
[18:29:05] <cpama> Derick, is there anything else I need to do after the insert? i've changed all "." to "-" and am using this for my insert statement:
[19:30:02] <erg0dic> if I run an aggregation on mongo 2.4, is there no way to send the result of the aggregation straight into a collection without first loading it in memory?
[19:32:47] <kurushiyama> erg0dic: You mean on the server?
[19:35:40] <erg0dic> kurushiyama: correct, like I run db.collection.aggregate([pipeline]) in the shell on the server
[19:36:59] <kurushiyama> erg0dic: Uhm, so how should that work, given the premise that the output of the current stage is supposed to be the input of the next?
[19:37:51] <erg0dic> so in 2.6 there is an $out stage operator that sends the result set straight to a collection
[19:38:36] <erg0dic> but it does not exist in 2.4, further db.collection.aggregate([pipeline]) does not appear to be able to return a cursor so you get an array of the entire result set at once
[19:39:22] <kurushiyama> erg0dic: Right, but still it is in RAM, sorta. On the server. What prevents you from upgrading, if I may ask? 2.4 is archeology, so to say. 2.6 is almost at EOL.
[19:40:01] <erg0dic> circumstances beyond my control :)
[19:41:33] <kurushiyama> Daring decision. Well, you have to live with what you are given, and walk down that line. I'd report back the estimated development time and impact on UX and drop a note that this functionality is implemented in 2.6
[19:42:43] <kurushiyama> Maybe with the additional info that 2.4 is not supported any more, should TSHTF.
[19:47:44] <deathanchor> erg0dic: I get around that by coding up my own queries that update another collection
[19:48:07] <deathanchor> the old aggregator was fun to play with, but sucked for performance and large result sets.
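A sketch of that workaround in the 2.4 shell, where aggregate() returns the whole result set inline (collection names hypothetical):

```javascript
// 2.4: no cursor, no $out - the shell helper returns { result: [...], ok: 1 },
// so copy the results into the target collection manually.
var res = db.events.aggregate([
    { $group: { _id: "$status", n: { $sum: 1 } } }
]);
res.result.forEach(function (doc) {
    db.eventCounts.save(doc);   // note: the inline result itself is capped at 16MB in 2.4
});

// From 2.6 onward the same thing is a single final stage:
// db.events.aggregate([{ $group: ... }, { $out: "eventCounts" }])
```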
[19:49:12] <ayee> kurushiyama: /etc/hostname (and `hostname -f`) returns a random string. It doesn't resolve via any name server. So I want the automation agent to use the ip on the interface.
[19:50:12] <kurushiyama> ayee: Uhm, sorry. Can you describe the problem, again. It has been a while... ;)
[19:50:44] <erg0dic> kurushiyama: thanks anyways! our team is well aware of the limitations, its only a matter of time
[19:51:28] <ayee> kurushiyama: The ops manager / automation agent(s) combination is grabbing `hostname -f`, and passing that around and trying to connect to it. When I try to spin up a replica set of 3, I get host unreachable everywhere.. and I see 'host' as the random string from `hostname -f`.
[19:51:41] <ayee> I want the automation agents to use the interface ip, not `hostname -f`
[19:52:18] <kurushiyama> ayee: Basic requirement. Hostnames are required to be properly resolved. Period.
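A common remedy when DNS can't resolve the machine names: give every node a static entry in /etc/hosts on each box (addresses and names here are hypothetical):

```
10.0.0.11  mongo-node-1
10.0.0.12  mongo-node-2
10.0.0.13  mongo-node-3
```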
[20:26:47] <jr3> is there a way to write a query that find a document, selects a property, and then sets a new property based on the selected property?
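Nobody picked this up in channel; in the server versions discussed here an update can't reference another field's value, so the usual pattern is read-modify-write. A sketch with hypothetical names:

```javascript
// read the source property, then write the derived property back
var doc = db.items.findOne({ _id: someId });
db.items.update(
    { _id: doc._id },
    { $set: { newProp: doc.oldProp + "-derived" } }
);
```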
[21:16:22] <oky> that's with 2.6, though - so just curious what other numbers are like
[21:17:35] <kurushiyama> oky: I rarely do full table scans.
[21:18:27] <oky> kurushiyama: do you do time series queries? when i do a group by time, mongo seems to want to do full scans (regardless of if i'm filtering my docs or not) - i will take a look at the query plan and see what's up in a bit
[21:18:39] <kurushiyama> oky: If you'd give me a sample doc, I could generate an according number of docs and see what performance I can acquire for a given aggregation
[21:19:10] <kurushiyama> oky: Depends on your indices, ofc.
[21:19:21] <oky> kurushiyama: sample doc: {hostname: "blah blah", status_code: 200, timestamp: Date.now() / 1000}. i will verify the indices
[21:37:44] <kurushiyama> oky: Ok, docs are generated now.
[21:38:53] <kurushiyama> oky: If you want to group by time, I'd always use an according index. Given the fact that ISODates are stored as 64bit ints, even millions of entries translate to just a few MB.
[21:40:33] <kurushiyama> oky: So, what does your aggregation look like, and what do you want to achieve?
[22:07:31] <kurushiyama> oky: I assume roughly something like http://hastebin.com/zejelizenu.sm ?
[22:08:51] <oky> kurushiyama: sure, looks good to me
[22:10:12] <kurushiyama> oky: Still generating. 3.5M atm, still some way till 10M ;)
[22:10:28] <oky> kurushiyama: what's a query on 1M look like?
[22:11:29] <kurushiyama> Let me give it another cigarette, then we'll find out ;)
[22:23:08] <kurushiyama> oky: Unoptimized, on my heavily loaded laptop: 32277 msecs for 4506002 documents. That is 0.00716311 msecs/doc, or about 7.163 microseconds/doc
[22:28:19] <kurushiyama> oky: We should stick to your use cases, however ;)
[22:28:54] <kurushiyama> So we put a match on "hostname", then?
[22:31:29] <oky> sure, but... this was an example dataset with a small number of fields. the query usually looks like: "SELECT AGG(col1), AGG(col2) FROM dataset WHERE filter1, filter2, filter3 GROUP BY time_bucket, dim1, dim2, dim3;"
[22:31:49] <oky> the fields being filtered on are usually arbitrary, wouldn't know them ahead of time - but let's say i make an index of every field
[22:32:33] <oky> i think 32s for 4.5 million seems like too much, should be more like 3 seconds
[22:32:35] <kurushiyama> oky: That is not ideal, to say the least. Unindexed matches always lead to a collection scan
[22:33:06] <kurushiyama> oky: My Laptop is quite heavily loaded, mind you, has only 4GB of RAM and the dataset was not preheated ;)
[22:33:35] <kurushiyama> oky: Plus, we iterated over 4.5M docs. All of them
[22:35:29] <oky> yeah, sure - i think the laptop being loaded is a problem
[22:35:46] <kurushiyama> oky: And still we are talking of some 7 _micro_seconds per doc.
[22:38:43] <kurushiyama> oky: What I would suggest for optimizing is to narrow down the time frame. For example for charts, I limit the date range. That is where you can get the biggest performance boosts. Another option is to do preaggregations.
[22:39:25] <oky> kurushiyama: if you limit the date range does it not do the full table query? i think that's what i ended up having trouble with
[22:40:48] <oky> kurushiyama: thanks for looking into it
[22:40:59] <kurushiyama> oky: It should not. However, you need to do an early match. Let's say you want the statuses for a hostname named "a" on May 6th: db.example.aggregate({$match:{"hostname":"a",date:{$gt...,$lt}}...)
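A filled-in version of the shape kurushiyama sketches, using oky's sample doc from earlier; the hourly bucket and the epoch-second range for May 6th are illustrative:

```javascript
// early $match (ideally index-backed) first, then group into hourly buckets
db.example.aggregate([
    { $match: {
        hostname: "a",
        timestamp: { $gte: 1462492800, $lt: 1462579200 }  // May 6 2016, epoch secs
    }},
    { $group: {
        _id: {
            status: "$status_code",
            // truncate the timestamp down to the hour
            bucket: { $subtract: ["$timestamp", { $mod: ["$timestamp", 3600] }] }
        },
        count: { $sum: 1 }
    }}
]);
```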
[22:41:45] <oky> it seems like the perf boosts come when aggregating across multiple machines
[22:43:01] <kurushiyama> oky: Well, that heavily depends. Making proper decisions with early matches, limiting the fields passed to the next stage to the bare minimum (hard to optimize with your data model) and narrowing down to what you really want helps a lot.
[22:55:56] <kurushiyama> oky: With something like that, I am even able to reduce the aggregation for a single day to slightly over a second
[23:08:10] <kurushiyama> oky: And we are talking of an average of 563250.25 events/day. Honestly, I cannot see this as slow. Not in the slightest. Especially on my rather crappy laptop.