[13:06:36] <zzz> Hi, I'm trying to execute a command in MongoDB from the Unix shell, but I only have access to the admin user, not to the current db. When executing a command, how can I specify the db as well? At the moment I'm using this: mongo host:port/admin --username name --password pass --eval 'db.coll.find()', but this should be executed on the misc db, not on the admin one. How can I achieve this?
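As a hedged aside (not from the log): the mongo shell supports an --authenticationDatabase flag that authenticates against admin while the working database is another one; host and credentials below are the placeholders from the question.

    # authenticate against admin, but run the query on the misc db
    mongo host:port/misc --username name --password pass \
        --authenticationDatabase admin \
        --eval 'db.coll.find().forEach(printjson)'  # forEach prints documents rather than the bare cursor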
[14:26:20] <rc3> hi guys, I am trying to import data from json file, example line: {S: "0P00000003", D: {"$date": "2015-01-30T00:00:00.000Z"}, O: 7.502000, H: 7.970000, L: 7.502000, C: 7.890000, V: {"$numberLong": "507537"}}
[14:26:33] <rc3> but mongoimport gave me this error: exception:BSON representation of supplied JSON is too large: Invalid use of reserved field name: $date
[14:27:12] <rc3> options used in mongoimport: ... --type json --file test.json
[14:29:10] <StephenLynx> where did you get that json from?
[14:29:39] <rc3> it's the result of an awk command which read data from a zip file
[14:29:40] <StephenLynx> and using a $ indicates the field is an operator.
[14:29:49] <Mongodbuser01> Hi there, my mongod.log keeps filling up with this error and I can't work out how to fix it http://pastebin.com/MMA1stwg ... does anyone have an idea of where to start looking?
[14:30:16] <rc3> if you look at the bottom of http://www.tin.org/bin/man.cgi?section=1&topic=MONGOIMPORT, the format for date column I used is very similar
[14:31:08] <rc3> @StephenLynx correct, I need to convert a date string like '2015-01-30' into BSON date format
[14:32:40] <rc3> that's fine, here's the example given via that link:
[14:32:41] <rc3> For example, to preserve type information for BSON types data_date and data_numberlong during mongoimport, the data should be in strict mode representation, as in the following: { "_id" : 1, "volume" : { "$numberLong" : "2980000" }, "date" : { "$date" : "2014-03-13T13:47:42.483-0400" } } For the data_numberlong type, mongoimport converts into a float during the import.
[14:33:24] <StephenLynx> I can spot a couple of differences. First, you don't have an _id field, but I doubt that is the problem.
[14:33:41] <StephenLynx> Second, you don't have the field names quoted.
[14:33:49] <rc3> I also tried another format, "2015-01-30T00:00:00.000-0000", for the date field, but it doesn't help
[14:34:19] <rc3> that might be it, let me add quotes
[14:35:15] <StephenLynx> I don't know mongoimport though, just saying what I saw that was different.
[14:37:34] <rc3> added both _id (unique) and double quotes for field names, same error
[14:49:28] <StephenLynx> yup, making sure the tool was in working condition has saved me a number of times.
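For reference, here is rc3's line in the strict-mode shape the linked docs describe, with field names quoted and an _id added; since rc3 reported the same error even in this form, the remaining suspect was the mongoimport build itself, which is what the last remark alludes to.

    { "_id": 1, "S": "0P00000003", "D": { "$date": "2015-01-30T00:00:00.000Z" }, "O": 7.502, "H": 7.97, "L": 7.502, "C": 7.89, "V": { "$numberLong": "507537" } }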
[16:18:17] <digicyc> What would be the best way to query a collection that has a field with a list of key/values. ex: {dynamicattributes: [{"key": "blah"}, {"key": "foo"}]}
[16:18:51] <digicyc> I need to find all documents that have a "key": "blah"
[16:19:01] <StephenLynx> you want to query for the key?
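The usual answer here is dot notation, which matches against any element of an array of subdocuments; a minimal sketch, assuming digicyc's collection is named things:

    // matches every document whose dynamicattributes array contains {"key": "blah"}
    db.things.find({ "dynamicattributes.key": "blah" })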
[17:55:50] <mick27> is it possible to go from a two-member replica set to only one?
[17:55:56] <mick27> when I remove the second member as per the doc
[17:56:08] <mick27> the remaining one goes into secondary
[18:01:15] <tplaner> hey guys, I'm able to connect to mongo just fine via the CLI, however when I attempt to connect through php on the same server I get "Authentication failed on database"
[18:02:47] <tplaner> I'm running pecl version 1.6.1 if that offers any assistance
[18:07:31] <GothAlice> mick27: You're running into the "network partition" issue. MongoDB, to be safe, will have the primary degrade to a read-only secondary if it determines it can't talk to 50% or more of the other hosts. When you have two members, shutting down one = 50% of the hosts, and it freaks out. To resolve this, add an "arbiter", which is a voting member that doesn't actually store data. Then the primary will see the arbiter and itself (66.6% of hosts) and stay primary.
[18:08:34] <GothAlice> mick27: Basically the primary can't tell if *it* lost its network connection, or if the secondary did, has no way to resolve the issue, and goes into self-protection mode.
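A sketch of the arbiter fix (hostname is a placeholder): run a third mongod with its own small dbpath for the arbiter, then from the primary's shell:

    rs.addArb("arbiter.example.com:27017")  // voting member that stores no data
    rs.status()                             // the data-bearing primary now sees 2 of 3 members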
[18:15:32] <wolfsburg18> I have a server where mongo is failing to start. When I try to manually load it with mongod -f /etc/mongod.conf --setParameter failIndexKeyTooLong=0, I get stuck at "about to fork child process, waiting until server is ready for connections".
[18:23:04] <wolfsburg18> @GothAlice: would you have any idea why mongo would fail to start with the issue I noted?
[18:23:48] <GothAlice> wolfsburg18: Could you gist/pastebin your mongod.conf and any relevant logs?
[18:25:10] <wolfsburg18> Here is the config. No logs because mongo is not writing to the log file at all, part of what has me completely stumped. http://pastebin.com/wMfyVvv9
[18:26:02] <wolfsburg18> I just upgraded to 2.6.7-1, running CentOS 6.6
[18:26:05] <GothAlice> For testing purposes, set fork = false and verbose = true.
[18:27:09] <GothAlice> wolfsburg18: Also, replSet = rs0 -- is this _actually_ a member of a replica set?
[18:27:51] <wolfsburg18> Yes this is a member of a replica set
[18:28:11] <wolfsburg18> Removed fork and added verbose to conf.
[18:28:56] <GothAlice> Either you are running mongod as a user without permission to write there (i.e. non-root) or /var/log/mongo doesn't exist. Maybe /var/log/mongodb?
[18:29:17] <wolfsburg18> should be /var/log/mongodb
[18:35:51] <GothAlice> wolfsburg18: Was that the issue?
[18:36:37] <wolfsburg18> Not sure but now with the log file I am able to make more progress... thanks for pointing out my typo... most recent error is complaining about /srv/mongodb/mongod.lock
[18:37:27] <wolfsburg18> it almost seems that someone started mongo as the root user instead of using the init script, and that changed most of the pid and lock files to root ownership instead of the mongod user...
[18:37:41] <GothAlice> That would certainly be a problem.
[18:41:42] <wolfsburg18> That was certainly the issue GothAlice; once all the lock and pid files were updated to be owned by mongod, and I verified all the databases were owned by mongod as well, it started right up. Thanks
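For anyone hitting the same thing, a hedged sketch of the cleanup; /srv/mongodb comes from the error above, while the pidfile path is a guess for a CentOS package install:

    # hand root-owned data, lock, and pid files back to the mongod user
    chown -R mongod:mongod /srv/mongodb
    chown mongod:mongod /var/run/mongodb/mongod.pid  # hypothetical path; match your mongod.conf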
[18:50:00] <mannoj> Any major high-level comparison between MongoDB 2.6, MongoDB 3.0 (yet to be released), and TokuMX?
[18:53:11] <GothAlice> TokuMX being an open-source fork of MongoDB, some of its changes could likely be back-ported. (Notably, having a paging scheduler to reduce disk I/O would be nice.) Some of the other things (compression, document-level locking, behind-the-scenes transactions, etc.) will be present in 3.0. I don't have a concrete comparison on-hand though, mannoj, sorry.
[18:53:27] <mick27> GothAlice: I missed your reply, thanks! Actually it makes sense; I wonder why I did not see it before. Can I set my remaining host as an arbiter in some way?
[20:31:14] <GothAlice> nebo: I've worked on medical records systems; transactional integrity is somewhat important, and not something MongoDB does. (I.e. marking someone as having had vaccination X, the write failing, then them getting it a second time could be bad for their health.) Logs, OTOH, are a great use for MongoDB's capped collection feature, and something I use a lot.
[20:31:41] <Sam___> GothAlice: you would use mongo over elastic for logs?
[20:32:40] <GothAlice> Simplicity of architecture. I use MongoDB for all sorts of things, including low-latency push queue (capped collection & tailing cursors), bulk file storage (26 TiB and counting), etc., etc. I only need to worry about scaling a single service.
[20:32:47] <GothAlice> Also I have structured log data.
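A minimal sketch of the capped-collection push queue being described; the collection name and size are illustrative:

    // a capped collection keeps insertion order and supports tailable cursors
    db.createCollection("events", { capped: true, size: 16 * 1024 * 1024 })
    db.events.insert({ msg: "hello", ts: new Date() })
    // a tailable, await-data cursor blocks waiting for new documents, like `tail -f`
    var cur = db.events.find().addOption(DBQuery.Option.tailable)
                              .addOption(DBQuery.Option.awaitData)
    while (cur.hasNext()) { printjson(cur.next()) }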
[20:33:34] <Sam___> the ELK stack is very simple......I would say almost as simple as MongoDB....
[20:34:12] <nebo> GothAlice: You're using MongoDB as a file store? What are the pros and cons of storing files there as opposed to the native file system?
[20:34:19] <GothAlice> It's convenient to have every log, not just syslog, go to a single destination. For example, this is how I get structured logging output from Python: https://gist.github.com/amcgregor/2bfbbeac18d4f2d6e173
[20:35:39] <GothAlice> nebo: Scalability. I don't worry about disks any more. (I have many, but their existence and management requires literally zero effort, I just swap drives when they fail or replace the smallest in the arrays when the space gets tight.) I'm also gathering data ("files") from a multitude of sources: transparent HTTP proxy, IRC bouncer, even "interesting" sniffed packets from my border gateway.
[20:37:00] <GothAlice> Because all of this data is, for the most part, structured, I have no need for a typical filesystem hierarchy. (Exposing this dataset as a FUSE filesystem, which I do, results in a filesystem with an infinite number of files, an infinite number of "directories" deep.)
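For the bulk-file side, the stock GridFS tooling is the simplest way in; a hedged sketch (database name and file are illustrative, and GothAlice's actual setup is clearly more elaborate):

    # store and list files in GridFS with the bundled mongofiles tool
    mongofiles --db archive put capture.pcap
    mongofiles --db archive list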
[20:37:16] <nebo> GothAlice: What about filesystems such as ZFS and GlusterFS?
[20:38:35] <GothAlice> nebo: And I'd store all of my metadata as extended attributes? How would I query it efficiently? I don't follow a typical "find and read" usage pattern with my dataset.
[20:39:11] <GothAlice> I was sad to see WinFS die during the Longhorn betas… it's the closest thing to what I'm doing.
[20:54:37] <GothAlice> nebo: Queries like: "Show me everything recorded from the same site as the first recorded file that is in the 95th percentile by access count." (t'was AdCritic…) or "What projects was I working on when I first discovered MashupGermany on Soundcloud.com?" (RITA, Marrow*) or "How much time elapsed between the first time I saw MashupGermany and first time I played a song by them?" (two weeks)
[20:56:54] <bybb> I've some difficulties running a replica set
[20:57:15] <bybb> I keep getting "No primary found in set" even if I wait
[20:58:16] <bybb> rs.status() says there isn't any primary, just a "SECONDARY" and two "UNKNOWN"
[20:59:37] <bybb> how can I make it change its state?
[20:59:39] <GothAlice> bybb: http://docs.mongodb.org/manual/tutorial/deploy-replica-set/ is one method to set up a replica set. At what point in the process are you blocked?
[21:00:27] <bybb> GothAlice everything is deployed (was deployed yesterday)
[21:00:43] <bybb> today I try to actually use it with a Node.js app
[21:01:07] <GothAlice> Confirm (via logs on all mongod hosts) that the nodes can talk to each other.
[21:01:39] <GothAlice> It may be important to ensure DNS resolution works, and that firewalls (rules, rate limiting, etc.) aren't preventing them from talking.
[21:02:14] <bybb> it worked yesterday, but you're right, I'll have a look
[21:07:57] <pretentiousgit> o/ Does anyone have experience with a "SyntaxError: Unexpected token o" failure in connect-mongo?
[21:08:26] <pretentiousgit> Specifically: "SyntaxError: Unexpected token o
[21:08:26] <pretentiousgit> at Object.parse [as unserialize] (native)"
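For context: that error is what JavaScript's JSON.parse throws when handed a value that is already an object, since the object coerces to the string "[object Object]" and parsing fails at the "o". A generic illustration, not connect-mongo itself:

    JSON.parse('{"a": 1}')  // fine: the argument is a JSON string
    JSON.parse({ a: 1 })    // SyntaxError: Unexpected token o — object coerced to "[object Object]"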
[21:09:41] <bybb> GothAlice well I don't understand the logs
[21:17:01] <Mannoj> Does mongodb have fractal indexing like Toku?
[21:17:23] <GothAlice> Mannoj: Not at the current time. When 3.0 is released, Toku will release a data storage back-end plugin to bring support to MongoDB proper.
[21:17:37] <GothAlice> Patent issues on their indexing algorithms, yay!
[21:18:16] <Mannoj> so in 3.0 it will have fractal-tree indexing I guess?
[21:18:30] <GothAlice> As a third-party plugin, yes.
[21:19:22] <dacuca> are you sure they are gonna release that plugin?
[21:19:50] <GothAlice> dacuca: http://www.tokutek.com/tokumx-for-mongodb/ < scroll to the bottom of their official site
[21:20:15] <GothAlice> http://www.tokutek.com/2014/12/yes-virginia-going-tokumx-storage-engine-mongodb-2-8/ also the blog
[21:21:32] <bybb> GothAlice They can ping and telnet to each other
[21:21:46] <GothAlice> bybb: Yup, got that from the log. ;)
[21:21:57] <GothAlice> bybb: But I still need to see the _other_ two mongod process logs.
[21:24:37] <Mannoj> you mean to say Toku has shared their plugin to have it with MongoDB 3.0 under 10gen, that will have fractal indexes and log-structured merge-trees?
[21:27:57] <GothAlice> bybb: Uhm, somehow it looks like you have nothing but secondaries that have never had a replica set initiated. rs.initiate() and rs.add() on the one you want to be the default primary, which I assume is the one the first log came from. It may be simplest to "nuke it from orbit" and re-deploy the entire replicaset if you don't have data you want to save. (I doubt there's data, none of the hosts have journals.)
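The shell steps being referred to, with placeholder hostnames; run once, on the member intended as the initial primary:

    rs.initiate()                       // this member becomes primary of a fresh set
    rs.add("mongo2.example.com:27017")  // then add the remaining members
    rs.add("mongo3.example.com:27017")
    rs.status()                         // expect one PRIMARY and two SECONDARYs once synced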
[21:28:24] <Mannoj> From your answer-> When 3.0 is released, Toku will release a data storage back-end plugin to bring support to MongoDB proper. ->> So MongoDB 3.0 will be released under 10GEN that will have fractal indexing?
[21:28:53] <GothAlice> Mannoj: The question still makes no sense. 10gen releases MongoDB 3.0. Toku releases their plugin. That's it.
[21:29:00] <bybb> GothAlice Do I need to do it each time I start the server? (they're VMs)
[21:31:58] <bybb> Yeah Sam__ told me about AWS yesterday :)
[21:32:06] <GothAlice> Alas, that doesn't inform me on how you've set it up. (EC2 at least has some presets I could extrapolate from. ;)
[21:33:14] <bybb> Well I've a storage linked to my HD
[21:33:15] <GothAlice> OTOH, I don't bother with the carbon footprint of virtual machines for local testing. It's simply unnecessary. :/
[21:38:26] <GothAlice> bybb: For local testing of a 2x3 sharded replica set with authentication I use https://gist.github.com/amcgregor/c33da0d76350f7018875, a handy script I hacked up for the purpose.
[21:41:08] <bybb> GothAlice you're going too far for me :)
[21:41:18] <bybb> I'll need a week to understand this
[21:41:28] <GothAlice> Read from bottom-to-top. ;)
[21:42:05] <GothAlice> Lines 139 and 140 indicate it's spinning up two "shards" with three replica set members each. (Nested for loop FTW.)
[21:42:44] <CAB_> Folks, when would Mongo 3.0 stable get released?
[21:42:50] <GothAlice> The start_shard function (lines 29-40) just runs mongod with some custom port numbers automatically determined using math.
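A rough sketch of that idea, not the actual gist; the port arithmetic and paths here are invented for illustration:

    # derive a unique port per (shard, replica) pair and launch one member
    start_shard() {
        local shard=$1 replica=$2
        local port=$((27017 + shard * 100 + replica))  # hypothetical scheme
        mongod --shardsvr --replSet "rs${shard}" --port "$port" \
            --dbpath "$DATA/rs${shard}-${replica}" \
            --logpath "$LOGS/rs${shard}-${replica}.log" --fork
    }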
[21:42:54] <GothAlice> CAB_: When it gets released.
[21:43:58] <CAB_> hehe.. I thought this was an official chat for mongo..
[21:45:05] <bybb> GothAlice I don't really want to bother you with XML (or at all), but can it help? http://pastebin.com/M2ShbMk6
[21:45:12] <GothAlice> bybb: To test, save that script somewhere and edit lines 11 to 13 to point to somewhere you'd like the data, PID files, and logs to go. chmod +x the script, and run it with "start" as an argument. On first run it'll do all the rs.initiate() rs.add() stuff needed to get the cluster running.
[21:45:56] <GothAlice> bybb: You'll want to change that network from NAT to Bridged.
[21:46:57] <GothAlice> bybb: Because NAT is what your consumer-grade firewall/router will do, it hides the VM behind a firewall.
[21:47:25] <GothAlice> You really want the VM to appear on the same network as the host machine in this instance, esp. since you'll need several VMs to talk to each-other without restriction.
[21:47:48] <GothAlice> (As a note, I *never* use VirtualBox's NAT mode. Ever. Period. It causes more problems than it is worth. ;)
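The equivalent VirtualBox CLI change, as a sketch; the VM name and host adapter are placeholders, and the VM should be powered off first:

    # switch the VM's first NIC from NAT to bridged on the host's primary interface
    VBoxManage modifyvm "mongo-vm-1" --nic1 bridged --bridgeadapter1 en0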
[21:48:37] <bybb> I used NAT and Host-only Adapter (took me two days...)
[21:49:36] <bybb> GothAlice Thanks a lot, I'll give it a try tomorrow
[21:50:20] <GothAlice> bybb: Remember to save a copy of my script somewhere. It could save you days of work and the overhead of needing virtual machines for local use. :)
[21:58:36] <awsmsrc> one of the records has a nil value but the guard query should filter that?
[21:58:37] <awsmsrc> no time information in "this.last_disconnected_at"
[21:59:36] <GothAlice> Well, the .ne => nil should work, but alas I don't Ruby. :/ Are you certain this.last_disconnected_at is nil and not something like an empty string?
[21:59:50] <GothAlice> (An empty string would pass those .where() clauses.)
[22:00:27] <GothAlice> The syntax difference between "where(:last_disconnected_at.exists => true)" and "where(last_connected_at:{"$gte" => "this.last_disconnected_at"})" is blowing my OCD out of the water. XD
[22:01:13] <awsmsrc> the syntax difference is annoying but… ruby
[22:01:22] <awsmsrc> I'll add a nil/string check just in case
[22:01:36] <GothAlice> Yeah; armouring your code against bad data is almost universally a good thing.
[22:02:06] <GothAlice> But in cases like this always try to find out exactly what your abstraction layer is sending to MongoDB, and try to manually formulate that query using the mongo shell. (I.e. remove abstraction from the equation.)
[22:03:07] <awsmsrc> I've specified Date on the field in the model; I'm nearly 100% confident it's setting it to nil
[22:03:42] <awsmsrc> I'm no mongo expert, but that query would work in pure mongo, right?
[22:03:49] <awsmsrc> I'm not missing a mongo concept?
[22:04:38] <GothAlice> Again, the only true way to verify what's going on is to do it in the mongo shell, without abstraction. Ignoring the repeated use of the word "where" in the abstraction (which is distinctly *not* what it's doing behind-the-scenes, or shouldn't be, 'cause that'd be terrible!) yes, that query is fully implementable using raw mongo (nested dictionary) syntax.
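A hedged reconstruction in the raw shell, assuming a users collection: note that the abstraction's {"$gte" => "this.last_disconnected_at"} likely sends that literal string as the operand, whereas comparing two fields of the same document needs $where on servers of this era:

    db.users.find({
        last_disconnected_at: { $exists: true, $ne: null },
        $where: "this.last_connected_at >= this.last_disconnected_at"
    })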
[22:19:04] <awsmsrc> GothAlice: the thing is I'm checking against two dates on that record
[22:19:11] <awsmsrc> it would be easy if I could pass a date in
[22:19:30] <awsmsrc> I'm basically checking who's online by seeing if their last connection is greater than their last disconnection
[22:51:46] <GothAlice> Nilium: Certain operations are "easier" using timestamps or Julian dates. Notably: simple date arithmetic that can then ignore concepts like "months".
[22:52:55] <GothAlice> awsmsrc: I assume you have some way of handling users that disappear without first saying a polite goodbye? (A la the people on IRC here "disconnecting" after 280 seconds of not actually being connected and things like that?)
[22:53:59] <awsmsrc> sort of, it's a distributed system and eventually consistent
[22:54:17] <awsmsrc> this data is only for an admin dashboard so it doesn't have to be correct to the minute
[22:54:38] <GothAlice> Did you find a way to log what your abstraction layer is actually saying to MongoDB?
[22:54:57] <awsmsrc> nope, I just cached an online attribute on the model and queried against that